# Quick Start Guide - Phase 1 Validation

## Files Overview

```
voice_agent_validation.py  - Main implementation (1300 lines)
├── Config                 - Configuration class
├── ValidationState        - State enum
├── ExtractionResult       - LLM extraction results
├── ValidationContext      - Context tracking
├── CustomerDatabase       - Simple CSV database
├── LLMFunctions           - LLM extraction & generation
├── ValidationStateMachine - State machine logic
└── VoiceAgentWithValidation - Main agent class

data/customers.csv         - Sample customer database
test_client.py             - Test client
README_PHASE1.md           - Comprehensive documentation
```

## Quick Start (5 steps)

### 1. Install Dependencies

```bash
pip install websockets loguru python-dotenv openai scipy numpy
```

### 2. Start vLLM Server

```bash
python -m vllm.entrypoints.openai.api_server \
    --model ./models/Qwen/Qwen2.5-7B-Instruct \
    --port 8000
```

### 3. Create .env File

```bash
cat > .env << EOF
VLLM_BASE_URL=http://localhost:8000/v1
VLLM_MODEL=./models/Qwen/Qwen2.5-7B-Instruct
VIENEU_MODEL_DIR=vieneu-0.3B
CUSTOMER_DB_FILE=./data/customers.csv
WS_PORT=8765
EOF
```
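
The variables above could be read at startup roughly as follows. This is a minimal standard-library sketch, not the code in `voice_agent_validation.py`: `EnvSettings` and `load_settings` are hypothetical names, the defaults simply mirror the `.env` values shown here, and it assumes `.env` has already been loaded into the process environment (for example with python-dotenv's `load_dotenv()`).

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvSettings:
    """Hypothetical container for the settings defined in .env."""
    vllm_base_url: str
    vllm_model: str
    vieneu_model_dir: str
    customer_db_file: str
    ws_port: int


def load_settings(env=None) -> EnvSettings:
    """Read settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    return EnvSettings(
        vllm_base_url=env.get("VLLM_BASE_URL", "http://localhost:8000/v1"),
        vllm_model=env.get("VLLM_MODEL", "./models/Qwen/Qwen2.5-7B-Instruct"),
        vieneu_model_dir=env.get("VIENEU_MODEL_DIR", "vieneu-0.3B"),
        customer_db_file=env.get("CUSTOMER_DB_FILE", "./data/customers.csv"),
        ws_port=int(env.get("WS_PORT", "8765")),  # env values arrive as strings
    )


settings = load_settings({"WS_PORT": "9000"})
print(settings.ws_port)  # → 9000
```

Passing a plain dict instead of `os.environ` keeps the function easy to test without touching the real environment.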

### 4. Run Agent

```bash
python voice_agent_validation.py
```

Expected output:
```
======================================================================
🎙️  ELECTRICITY CALL CENTER - PHASE 1: VALIDATION
======================================================================
✅ ASR loaded
✅ LLM client connected
✅ VieNeu loaded with default voice
✅ Loaded 10 customers from database
✅ Voice agent listening on ws://0.0.0.0:8765
```

### 5. Test (in another terminal)

```bash
# Interactive test
python test_client.py

# Automated scenarios
python test_client.py auto
```

## State Machine Quick Reference

```
State                     | Trigger                | Next State
--------------------------|------------------------|---------------------------
GREETING                  | Auto                   | AWAIT_PHONE_REQUEST
AWAIT_PHONE_REQUEST       | Phone extracted        | COLLECTING_PHONE
                          | Extraction failed      | AWAIT_PHONE_REQUEST (retry)
                          | Retry limit reached    | ESCALATE_TO_HUMAN
COLLECTING_PHONE          | User confirms          | AWAIT_NAME_REQUEST
                          | User rejects           | AWAIT_PHONE_REQUEST
AWAIT_NAME_REQUEST        | Name extracted         | COLLECTING_NAME
                          | Extraction failed      | AWAIT_NAME_REQUEST (retry)
                          | Retry limit reached    | ESCALATE_TO_HUMAN
COLLECTING_NAME           | User confirms          | VALIDATING_USER
                          | User rejects           | AWAIT_NAME_REQUEST
VALIDATING_USER           | DB lookup complete     | VALIDATION_COMPLETE
VALIDATION_COMPLETE       | -                      | (Phase 2: Intent Detection)
```
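
One way to encode the table is a lookup keyed by `(state, trigger)`. The sketch below is illustrative only: the `State` enum, the trigger strings, and `next_state` are hypothetical stand-ins, not the actual `ValidationState` or `ValidationStateMachine` API. Escalation after repeated failures (see Error Handling) is left out for brevity.

```python
from enum import Enum, auto


class State(Enum):
    GREETING = auto()
    AWAIT_PHONE_REQUEST = auto()
    COLLECTING_PHONE = auto()
    AWAIT_NAME_REQUEST = auto()
    COLLECTING_NAME = auto()
    VALIDATING_USER = auto()
    VALIDATION_COMPLETE = auto()


# (current state, trigger) -> next state. Trigger names are made up for this sketch.
TRANSITIONS = {
    (State.GREETING, "auto"): State.AWAIT_PHONE_REQUEST,
    (State.AWAIT_PHONE_REQUEST, "extracted"): State.COLLECTING_PHONE,
    (State.AWAIT_PHONE_REQUEST, "failed"): State.AWAIT_PHONE_REQUEST,
    (State.COLLECTING_PHONE, "confirmed"): State.AWAIT_NAME_REQUEST,
    (State.COLLECTING_PHONE, "rejected"): State.AWAIT_PHONE_REQUEST,
    (State.AWAIT_NAME_REQUEST, "extracted"): State.COLLECTING_NAME,
    (State.AWAIT_NAME_REQUEST, "failed"): State.AWAIT_NAME_REQUEST,
    (State.COLLECTING_NAME, "confirmed"): State.VALIDATING_USER,
    (State.COLLECTING_NAME, "rejected"): State.AWAIT_NAME_REQUEST,
    (State.VALIDATING_USER, "lookup_done"): State.VALIDATION_COMPLETE,
}


def next_state(state: State, trigger: str) -> State:
    """Return the next state; unknown triggers leave the state unchanged."""
    return TRANSITIONS.get((state, trigger), state)


# Walk the happy path from greeting to completed validation.
state = State.GREETING
for trigger in ("auto", "extracted", "confirmed", "extracted", "confirmed", "lookup_done"):
    state = next_state(state, trigger)
print(state.name)  # → VALIDATION_COMPLETE
```

Keeping the transitions in data rather than in nested `if` statements makes it easy to compare the code against the table row by row.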

## LLM Functions

### extract_phone_number(speech) → ExtractionResult

**Input**: "Số tôi là 0901234567" ("My number is 0901234567")

**Output**:
```python
ExtractionResult(
    success=True,
    value="0901234567",
    confidence=0.95,
    needs_confirmation=False
)
```

**Handles**:
- Filler words: "uh", "à", "thì"
- Hesitation: "cho tôi nhớ... 0901..." ("let me remember... 0901...")
- Multiple numbers in one utterance: picks the caller's own phone number
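
Whatever the LLM returns can be sanity-checked with a cheap rule before it is trusted. The sketch below is a rule-based check, not the LLM extraction itself. `normalize_phone` is a hypothetical helper, and the "10 digits starting with 0" rule is an assumption based only on the sample numbers in this guide.

```python
import re
from typing import Optional


def normalize_phone(raw: str) -> Optional[str]:
    """Strip separators and validate the shape of a candidate phone number.

    Assumption: valid numbers have 10 digits and start with 0,
    like the sample data in this guide.
    """
    digits = re.sub(r"\D", "", raw)  # drop spaces, dots, dashes, letters
    return digits if re.fullmatch(r"0\d{9}", digits) else None


print(normalize_phone("090 123 4567"))  # → 0901234567
print(normalize_phone("12345"))         # → None
```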

### extract_name(speech) → ExtractionResult

**Input**: "Tôi là Nguyễn Văn An" ("I am Nguyễn Văn An")

**Output**:
```python
ExtractionResult(
    success=True,
    value="Nguyễn Văn An",
    confidence=0.92,
    needs_confirmation=False
)
```

**Handles**:
- Vietnamese diacritics preserved
- Introductory phrases: "tôi là", "tên tôi là", "con tên là" (all variants of "I am" / "my name is")
- Partial names detected and rejected
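
A rule-based approximation of the last two behaviors might look like this. It is only a sketch: the real extraction is done by the LLM, `clean_name` is a hypothetical helper, and treating a single word as a partial name is an assumption.

```python
import unicodedata
from typing import Optional

# Introductory phrases taken from the list above.
LEAD_INS = ("tên tôi là", "con tên là", "tôi là")


def clean_name(speech: str) -> Optional[str]:
    """Strip an introductory phrase; return None for likely partial names."""
    # NFC keeps diacritics intact and makes prefix comparison reliable.
    text = unicodedata.normalize("NFC", speech).strip()
    lowered = text.lower()
    for phrase in LEAD_INS:
        phrase = unicodedata.normalize("NFC", phrase)
        if lowered.startswith(phrase):
            text = text[len(phrase):].strip()
            break
    # Assumption: a full name has at least two words.
    return text if len(text.split()) >= 2 else None


print(clean_name("Tôi là Nguyễn Văn An"))  # → Nguyễn Văn An
print(clean_name("tên tôi là An"))         # → None
```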

### generate_response(state, context) → str

**Example**:
```python
state = ValidationState.COLLECTING_PHONE
context.phone_number = "0901234567"

response = generate_response(state, context)
# → "Số điện thoại của quý khách là 0901234567. Đúng không ạ?"
#   ("Your phone number is 0901234567. Is that correct?")
```

## Customer Database

### Sample Data

```csv
phone_number,customer_name,account_id,registration_date
0901234567,Nguyễn Văn An,KH001,2023-01-15
0912345678,Trần Thị Bình,KH002,2023-02-20
```

### Lookup

```python
db = CustomerDatabase("./data/customers.csv")
customer = db.lookup("0901234567")
# → {'name': 'Nguyễn Văn An', 'account_id': 'KH001', ...}
```
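
For readers who want to see the idea without opening the source, here is a minimal CSV-backed lookup. `SimpleCustomerDB` is a hypothetical stand-in, not the real `CustomerDatabase`; it takes an open file object rather than a path so the example stays self-contained, and the returned keys follow the lookup result shown above.

```python
import csv
import io
from typing import Optional

SAMPLE_CSV = """phone_number,customer_name,account_id,registration_date
0901234567,Nguyễn Văn An,KH001,2023-01-15
0912345678,Trần Thị Bình,KH002,2023-02-20
"""


class SimpleCustomerDB:
    """Hypothetical stand-in for CustomerDatabase, indexed by phone number."""

    def __init__(self, csv_file) -> None:
        # Phone numbers stay strings so leading zeros survive.
        self._by_phone = {
            row["phone_number"]: {
                "name": row["customer_name"],
                "account_id": row["account_id"],
                "registration_date": row["registration_date"],
            }
            for row in csv.DictReader(csv_file)
        }

    def lookup(self, phone: str) -> Optional[dict]:
        """Return the customer record, or None for an unknown number."""
        return self._by_phone.get(phone)


db = SimpleCustomerDB(io.StringIO(SAMPLE_CSV))
print(db.lookup("0901234567")["account_id"])  # → KH001
print(db.lookup("0999999999"))                # → None
```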

### Name Matching

```python
similarity = db.fuzzy_match_name("Nguyen Van An", "Nguyễn Văn An")
# → 1.0 (exact match after normalization)

similarity = db.fuzzy_match_name("Nguyen Van A", "Nguyễn Văn An")
# → 0.85 (partial match)
```
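
A common way to get this behavior is to strip diacritics before comparing. The sketch below uses `unicodedata` and `difflib.SequenceMatcher` from the standard library. It is an assumption about the approach, not the actual `fuzzy_match_name` implementation, so its scores for partial matches will differ from the illustrative values above.

```python
import unicodedata
from difflib import SequenceMatcher


def normalize_name(name: str) -> str:
    """Lowercase, strip diacritics, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFD", name.lower())
    stripped = "".join(c for c in decomposed if unicodedata.category(c) != "Mn")
    # "đ" has no combining-mark decomposition, so map it explicitly.
    stripped = stripped.replace("đ", "d")
    return " ".join(stripped.split())


def name_similarity(spoken: str, stored: str) -> float:
    """Similarity in [0, 1] between two names after normalization."""
    return SequenceMatcher(None, normalize_name(spoken), normalize_name(stored)).ratio()


print(name_similarity("Nguyen Van An", "Nguyễn Văn An"))  # → 1.0
```

Normalizing both sides matters because ASR output often drops or mangles tone marks, while the database stores the fully accented name.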

## Validation Outcomes

### 1. Existing Customer (Verified)

```python
context.customer_status = "existing_verified"
context.account_id = "KH001"

# Response: "Xin chào anh/chị Nguyễn Văn An. Hệ thống đã xác nhận..."
#   ("Hello Mr./Ms. Nguyễn Văn An. The system has verified...")
```

### 2. Existing Customer (Name Mismatch)

```python
context.customer_status = "existing_mismatch"
context.database_name = "Nguyễn Văn An"

# Response: "Số này đã đăng ký với tên Nguyễn Văn An..."
#   ("This number is registered under the name Nguyễn Văn An...")
```

### 3. New Customer

```python
context.customer_status = "new_customer"
context.account_id = None

# Response: "Hệ thống chưa có thông tin. Muốn đăng ký không?"
#   ("The system has no record. Would you like to register?")
```
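
The three outcomes follow from two inputs: whether the phone number was found, and how closely the spoken name matches the stored one. A minimal sketch of that decision, where `classify_customer` is a hypothetical helper and the default threshold of 0.8 is taken from the "Lower from 0.8" comment later in this guide:

```python
from typing import Optional


def classify_customer(record: Optional[dict], similarity: float,
                      threshold: float = 0.8) -> str:
    """Map a lookup result and a name similarity score to a status."""
    if record is None:
        return "new_customer"        # phone not in the database
    if similarity > threshold:
        return "existing_verified"   # phone found and name matches
    return "existing_mismatch"       # phone found but name differs


print(classify_customer({"account_id": "KH001"}, 0.98))  # → existing_verified
print(classify_customer({"account_id": "KH001"}, 0.40))  # → existing_mismatch
print(classify_customer(None, 0.0))                      # → new_customer
```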

## Error Handling

### Retry Logic

```python
Config.MAX_RETRY_PHONE = 3
Config.MAX_RETRY_NAME = 3

# After 3 failed attempts:
state = ValidationState.ESCALATE_TO_HUMAN
# Response: "Xin lỗi, tôi sẽ chuyển máy cho nhân viên..."
#   ("Sorry, I will transfer you to a staff member...")
```
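
The counting behind this can be sketched as a small per-field tracker. `RetryTracker` is a hypothetical helper, not part of the actual state machine; the limits mirror `MAX_RETRY_PHONE` and `MAX_RETRY_NAME` above.

```python
from dataclasses import dataclass, field


@dataclass
class RetryTracker:
    """Counts failed attempts per field and signals when to escalate."""
    limits: dict = field(default_factory=lambda: {"phone": 3, "name": 3})
    counts: dict = field(default_factory=lambda: {"phone": 0, "name": 0})

    def record_failure(self, field_name: str) -> bool:
        """Record one failure; return True once the limit is reached."""
        self.counts[field_name] += 1
        return self.counts[field_name] >= self.limits[field_name]


tracker = RetryTracker()
results = [tracker.record_failure("phone") for _ in range(3)]
print(results)  # → [False, False, True]
```

The phone and name counters are independent, so a caller who struggles with the phone number still gets a full set of attempts for the name.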

### Confidence Thresholds

```python
Config.PHONE_CONFIDENCE_THRESHOLD = 0.7
Config.NAME_CONFIDENCE_THRESHOLD = 0.6

# If confidence below threshold → ask for confirmation
# If confidence above threshold → skip confirmation
```
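
As a sketch, the rule reduces to one comparison per field. `needs_confirmation` here is a hypothetical function; treating a confidence exactly equal to the threshold as "no confirmation needed" is an assumption, since the boundary case is not specified.

```python
# Threshold values copied from the Config lines above.
THRESHOLDS = {"phone": 0.7, "name": 0.6}


def needs_confirmation(field_name: str, confidence: float) -> bool:
    """True when the extracted value should be read back to the caller."""
    return confidence < THRESHOLDS[field_name]


print(needs_confirmation("phone", 0.95))  # → False
print(needs_confirmation("name", 0.50))   # → True
```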

## Common Issues

### Issue: Phone extraction fails

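**Cause**: The utterance did not contain a recognizable phone number.

**Solution**: The state machine retries with a clearer prompt on each attempt:
- Attempt 1: "Xin lỗi, chưa nghe rõ. Nói lại?" ("Sorry, I didn't catch that. Could you repeat it?")
- Attempt 2: "Vui lòng nói từng số một?" ("Could you please say the digits one at a time?")
- Attempt 3: Escalate to human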

### Issue: Name matching too strict

**Solution**: Adjust threshold in `CustomerDatabase.fuzzy_match_name()`
```python
if similarity > 0.7:  # Lower from 0.8
    return "existing_verified"
```

### Issue: LLM generates wrong format

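**Cause**: The LLM did not return bare JSON, for example because it wrapped the JSON in a markdown code fence.

**Solution**: The system prompt emphasizes JSON output, and the parser strips markdown code fences before parsing: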
```python
if "```json" in response:
    response = response.split("```json")[1].split("```")[0]
```
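
A slightly more defensive version of the same idea also handles fences without a `json` tag and replies that are not valid JSON at all. This is a sketch with a hypothetical `parse_llm_json` helper, not the parser used in `voice_agent_validation.py`.

```python
import json
import re
from typing import Optional

FENCE = "`" * 3  # built at runtime so this snippet contains no literal fence

# Match an optional "json" tag after the opening fence, then capture the body.
_FENCED = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)


def parse_llm_json(response: str) -> Optional[dict]:
    """Parse a JSON object from an LLM reply; return None on failure."""
    match = _FENCED.search(response)
    candidate = match.group(1) if match else response
    try:
        parsed = json.loads(candidate.strip())
    except json.JSONDecodeError:
        return None
    return parsed if isinstance(parsed, dict) else None


reply = FENCE + 'json\n{"success": true, "value": "0901234567"}\n' + FENCE
print(parse_llm_json(reply))       # → {'success': True, 'value': '0901234567'}
print(parse_llm_json("not json"))  # → None
```

Returning `None` instead of raising lets the caller treat a malformed reply like any other failed extraction and go through the normal retry path.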

## Logging Output

```
📍 State: AWAIT_PHONE_REQUEST | User: số tôi là 0901234567
✅ Extracted phone: 0901234567 (confidence: 0.95)

📍 State: AWAIT_NAME_REQUEST | User: tôi là Nguyễn Văn An
✅ Extracted name: Nguyễn Văn An (confidence: 0.92)

🔍 Validating: 0901234567 | Nguyễn Văn An
📊 Name similarity: 0.98
✅ Customer verified: KH001

📊 Context:
{
  "phone": "0901234567",
  "phone_confirmed": true,
  "name": "Nguyễn Văn An",
  "name_confirmed": true,
  "status": "existing_verified",
  "retries": {"phone": 0, "name": 0}
}
```

## Performance

- ASR: ~0.5-1s per 5s of audio
- LLM extraction: ~0.3-0.5s
- Database lookup: <0.01s
- TTS synthesis: ~0.5-1s
- **Total**: 2-3s per turn
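
As a quick check of the arithmetic, the itemized stages sum to roughly 1.3-2.5s, so the 2-3s total presumably includes time that is not itemized in the list; where that time goes is not stated here. The sketch treats "<0.01s" as a range of 0 to 0.01s, which is an assumption.

```python
# (low, high) latency estimates in seconds, from the list above.
STAGES = {
    "asr": (0.5, 1.0),
    "llm_extraction": (0.3, 0.5),
    "db_lookup": (0.0, 0.01),
    "tts": (0.5, 1.0),
}

low = sum(lo for lo, _ in STAGES.values())
high = sum(hi for _, hi in STAGES.values())
print(f"itemized stages: {low:.2f}-{high:.2f}s")  # → itemized stages: 1.30-2.51s
```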

## Next: Phase 2

After validation completes, extend to intent detection:

```python
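# Hypothetical wrapper: `await` must run inside an async function.
# `detect_intent` and `BillingHandler` are Phase 2 placeholders.
async def handle_turn(user_speech: str) -> None:
    if validation_sm.is_complete():
        # Get validated context
        context = validation_sm.get_context()

        # Detect intent
        intent = await detect_intent(user_speech, context)

        # Route to handler
        if intent == "BILLING_INQUIRY":
            handler = BillingHandler(context)
            await handler.process(user_speech)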
```

## Configuration Tips

### For Better Extraction

```bash
# Use larger LLM
VLLM_MODEL=./models/Qwen/Qwen2.5-14B-Instruct

# Lower temperature (more deterministic)
TEMPERATURE=0.1
```

### For Faster Response

```bash
# Reduce max tokens
MAX_TOKENS=100

# Shorter audio chunks
CHUNK_DURATION=3
```

### For Better Voice Quality

```bash
# Use specific voice
VIENEU_VOICE_ID=ngoc_huyen
```

## Support

For full documentation, see `README_PHASE1.md`.
