# Aurora Database Sample Queries

This document provides sample queries for the IoT Wireless Device Bulk Management Aurora PostgreSQL database using the RDS Data API.

## Database Schema

### Tables

#### `devices` Table
Stores individual device information and status.

| Column | Type | Description |
|--------|------|-------------|
| smsn | VARCHAR(100) | Device SMSN (Primary Key) |
| aws_wireless_device_id | VARCHAR(36) | AWS IoT Wireless Device ID |
| device_name | VARCHAR(255) | Device name |
| device_profile_id | VARCHAR(36) | Device profile ID |
| uplink_destination_name | VARCHAR(255) | Uplink destination |
| batch_id | VARCHAR(100) | Associated batch ID |
| created_at | TIMESTAMP | Creation timestamp |
| updated_at | TIMESTAMP | Last update timestamp |
| status | VARCHAR(50) | Device status (PENDING, PROVISIONED, UPDATED, FAILED) |
| status_details | TEXT | Additional status information |
| positioning_enabled | BOOLEAN | Positioning feature enabled |
| positioning_destination_name | VARCHAR(255) | Positioning destination |

#### `batches` Table
Stores batch processing information.

| Column | Type | Description |
|--------|------|-------------|
| batch_id | VARCHAR(100) | Batch ID (Primary Key) |
| operation | VARCHAR(20) | Operation type (create, update) |
| total_devices | INTEGER | Total devices in batch |
| processed_devices | INTEGER | Number of processed devices |
| successful_devices | INTEGER | Number of successful devices |
| failed_devices | INTEGER | Number of failed devices |
| status | VARCHAR(50) | Batch status (PROCESSING, COMPLETED) |
| s3_url | VARCHAR(500) | S3 URL of input file |
| created_at | TIMESTAMP | Creation timestamp |
| updated_at | TIMESTAMP | Last update timestamp |
| completed_at | TIMESTAMP | Completion timestamp |

---

## Query Examples

### 1. Basic Device Queries

#### Get all devices
```sql
SELECT * FROM devices 
ORDER BY created_at DESC 
LIMIT 100;
```

#### Get device by SMSN
```sql
SELECT * FROM devices 
WHERE smsn = 'your-device-smsn-here';
```

#### Get device by AWS ID
```sql
SELECT * FROM devices 
WHERE aws_wireless_device_id = 'your-aws-device-id-here';
```

#### Search devices by name pattern
```sql
SELECT smsn, device_name, status, created_at 
FROM devices 
WHERE device_name LIKE '%pattern%'
ORDER BY created_at DESC;
```

---

### 2. Device Status Queries

#### Count devices by status
```sql
SELECT status, COUNT(*) as count 
FROM devices 
GROUP BY status 
ORDER BY count DESC;
```

#### Get all failed devices
```sql
SELECT smsn, device_name, status_details, created_at 
FROM devices 
WHERE status = 'FAILED'
ORDER BY created_at DESC;
```

#### Get recently provisioned devices
```sql
SELECT smsn, device_name, aws_wireless_device_id, created_at 
FROM devices 
WHERE status = 'PROVISIONED' 
  AND created_at > NOW() - INTERVAL '24 hours'
ORDER BY created_at DESC;
```

#### Get devices pending processing
```sql
SELECT smsn, device_name, batch_id, created_at 
FROM devices 
WHERE status = 'PENDING'
ORDER BY created_at ASC;
```

---

### 3. Batch Queries

#### Get all batches
```sql
SELECT * FROM batches 
ORDER BY created_at DESC 
LIMIT 50;
```

#### Get batch by ID
```sql
SELECT * FROM batches 
WHERE batch_id = 'your-batch-id-here';
```

#### Get batch statistics
```sql
SELECT 
    batch_id,
    operation,
    total_devices,
    successful_devices,
    failed_devices,
    ROUND((successful_devices::DECIMAL / NULLIF(total_devices, 0) * 100), 2) as success_rate,
    status,
    created_at,
    completed_at,
    EXTRACT(EPOCH FROM (completed_at - created_at))/60 as duration_minutes
FROM batches 
WHERE status = 'COMPLETED'
ORDER BY created_at DESC;
```

#### Get currently processing batches
```sql
SELECT 
    batch_id,
    operation,
    total_devices,
    processed_devices,
    ROUND((processed_devices::DECIMAL / NULLIF(total_devices, 0) * 100), 2) as progress_percent,
    created_at,
    EXTRACT(EPOCH FROM (NOW() - created_at))/60 as running_minutes
FROM batches 
WHERE status = 'PROCESSING'
ORDER BY created_at DESC;
```

#### Get batch success/failure summary
```sql
SELECT 
    operation,
    COUNT(*) as total_batches,
    SUM(total_devices) as total_devices,
    SUM(successful_devices) as total_successful,
    SUM(failed_devices) as total_failed,
    ROUND(AVG(successful_devices::DECIMAL / NULLIF(total_devices, 0) * 100), 2) as avg_success_rate
FROM batches 
WHERE status = 'COMPLETED'
GROUP BY operation;
```

---

### 4. Join Queries (Devices + Batches)

#### Get devices with batch information
```sql
SELECT 
    d.smsn,
    d.device_name,
    d.status as device_status,
    d.aws_wireless_device_id,
    b.batch_id,
    b.operation,
    b.status as batch_status,
    d.created_at
FROM devices d
LEFT JOIN batches b ON d.batch_id = b.batch_id
ORDER BY d.created_at DESC
LIMIT 100;
```

#### Get failed devices by batch
```sql
SELECT 
    b.batch_id,
    b.operation,
    b.created_at as batch_created,
    COUNT(d.smsn) as failed_device_count,
    STRING_AGG(d.smsn, ', ') as failed_smsns
FROM batches b
JOIN devices d ON b.batch_id = d.batch_id
WHERE d.status = 'FAILED'
GROUP BY b.batch_id, b.operation, b.created_at
ORDER BY b.created_at DESC;
```

---

### 5. Time-Based Queries

#### Devices created in last 24 hours
```sql
SELECT 
    status,
    COUNT(*) as count
FROM devices 
WHERE created_at > NOW() - INTERVAL '24 hours'
GROUP BY status;
```

#### Devices created in date range
```sql
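-- Half-open range: an upper bound of '2025-10-31' is read as midnight,
-- so BETWEEN would skip almost all of October 31
SELECT * FROM devices 
WHERE created_at >= '2025-10-01' AND created_at < '2025-11-01'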
ORDER BY created_at DESC;
```
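
When the date range comes from user input, the same half-open range can be passed through the Data API as parameters. The sketch below is one way to build them, assuming `created_at` is a `TIMESTAMP` without time zone as listed in the schema. The names `RANGE_SQL`, `date_range_parameters`, `start_ts`, and `end_ts` are illustrative and not part of the project.

```python
from datetime import date, datetime, time, timedelta
from typing import Any, Dict, List

# Hypothetical statement text; the placeholders match the parameter names below.
RANGE_SQL = (
    "SELECT * FROM devices "
    "WHERE created_at >= :start_ts AND created_at < :end_ts "
    "ORDER BY created_at DESC"
)


def date_range_parameters(first_day: date, last_day: date) -> List[Dict[str, Any]]:
    """Build parameters covering first_day 00:00 up to, but excluding, the day after last_day."""
    start = datetime.combine(first_day, time.min)
    end = datetime.combine(last_day + timedelta(days=1), time.min)
    return [
        {
            "name": name,
            # typeHint tells the Data API to send the string as a timestamp.
            "typeHint": "TIMESTAMP",
            "value": {"stringValue": moment.strftime("%Y-%m-%d %H:%M:%S")},
        }
        for name, moment in (("start_ts", start), ("end_ts", end))
    ]


parameters = date_range_parameters(date(2025, 10, 1), date(2025, 10, 31))
for parameter in parameters:
    print(parameter["name"], parameter["value"]["stringValue"])
```

Pass `RANGE_SQL` as `sql` and the returned list as `parameters` to `execute_statement`.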

#### Hourly device creation rate (last 24 hours)
```sql
SELECT 
    DATE_TRUNC('hour', created_at) as hour,
    COUNT(*) as devices_created,
    COUNT(CASE WHEN status = 'PROVISIONED' THEN 1 END) as successful,
    COUNT(CASE WHEN status = 'FAILED' THEN 1 END) as failed
FROM devices 
WHERE created_at > NOW() - INTERVAL '24 hours'
GROUP BY DATE_TRUNC('hour', created_at)
ORDER BY hour DESC;
```

---

### 6. Advanced Analytics

#### Device provisioning success rate by batch
```sql
SELECT 
    b.batch_id,
    b.operation,
    b.total_devices,
    COUNT(d.smsn) as devices_in_db,
    COUNT(CASE WHEN d.status = 'PROVISIONED' THEN 1 END) as provisioned,
    COUNT(CASE WHEN d.status = 'FAILED' THEN 1 END) as failed,
    ROUND(
        COUNT(CASE WHEN d.status = 'PROVISIONED' THEN 1 END)::DECIMAL / 
        NULLIF(COUNT(d.smsn), 0) * 100, 
        2
    ) as success_rate,
    b.created_at
FROM batches b
LEFT JOIN devices d ON b.batch_id = d.batch_id
GROUP BY b.batch_id, b.operation, b.total_devices, b.created_at
ORDER BY b.created_at DESC;
```

#### Top error patterns
```sql
SELECT 
    status_details,
    COUNT(*) as occurrence_count,
    ROUND(COUNT(*)::DECIMAL / (SELECT COUNT(*) FROM devices WHERE status = 'FAILED') * 100, 2) as percentage
FROM devices 
WHERE status = 'FAILED' 
  AND status_details IS NOT NULL
GROUP BY status_details
ORDER BY occurrence_count DESC
LIMIT 10;
```

#### Device profile usage statistics
```sql
SELECT 
    device_profile_id,
    COUNT(*) as device_count,
    COUNT(CASE WHEN status = 'PROVISIONED' THEN 1 END) as provisioned_count,
    COUNT(CASE WHEN positioning_enabled = true THEN 1 END) as positioning_enabled_count
FROM devices
GROUP BY device_profile_id
ORDER BY device_count DESC;
```

---

### 7. Maintenance Queries

#### Delete old failed devices (older than 30 days)
```sql
DELETE FROM devices 
WHERE status = 'FAILED' 
  AND created_at < NOW() - INTERVAL '30 days';
```

#### Archive completed batches (older than 90 days)
```sql
-- First, create an archive table if needed
CREATE TABLE IF NOT EXISTS batches_archive (LIKE batches INCLUDING ALL);

-- Move old batches in a single statement so the archived rows and the
-- deleted rows are exactly the same set (NOW() is evaluated only once)
WITH moved AS (
    DELETE FROM batches 
    WHERE status = 'COMPLETED' 
      AND completed_at < NOW() - INTERVAL '90 days'
    RETURNING *
)
INSERT INTO batches_archive 
SELECT * FROM moved;
```

#### Update device status
```sql
UPDATE devices 
SET status = 'PROVISIONED', 
    status_details = 'Manually verified'
WHERE smsn = 'your-device-smsn-here';
```

---

### 8. Performance Queries

#### Table sizes
```sql
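-- format('%I.%I', ...) quotes identifiers, so mixed-case table names also resolve
SELECT 
    schemaname,
    tablename,
    pg_size_pretty(pg_total_relation_size(format('%I.%I', schemaname, tablename)::regclass)) AS size
FROM pg_tables
WHERE schemaname = 'public'
ORDER BY pg_total_relation_size(format('%I.%I', schemaname, tablename)::regclass) DESC;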
```

#### Index usage statistics
```sql
-- pg_stat_user_indexes names these columns relname and indexrelname
SELECT 
    schemaname,
    relname as tablename,
    indexrelname as indexname,
    idx_scan as index_scans,
    idx_tup_read as tuples_read,
    idx_tup_fetch as tuples_fetched
FROM pg_stat_user_indexes
WHERE schemaname = 'public'
ORDER BY idx_scan DESC;
```

---

## Using RDS Data API with AWS CLI

### Execute a query
```bash
aws rds-data execute-statement \
  --resource-arn "arn:aws:rds:region:account:cluster:cluster-name" \
  --secret-arn "arn:aws:secretsmanager:region:account:secret:secret-name" \
  --database "iot_devices" \
  --sql "SELECT COUNT(*) FROM devices WHERE status = 'PROVISIONED'"
```

### Execute with parameters
```bash
aws rds-data execute-statement \
  --resource-arn "arn:aws:rds:region:account:cluster:cluster-name" \
  --secret-arn "arn:aws:secretsmanager:region:account:secret:secret-name" \
  --database "iot_devices" \
  --sql "SELECT * FROM devices WHERE smsn = :smsn" \
  --parameters '[{"name":"smsn","value":{"stringValue":"your-smsn-here"}}]'
```

---

## Python Examples (Using boto3)

### Query devices
```python
import boto3
import json

rds_client = boto3.client('rds-data')

response = rds_client.execute_statement(
    resourceArn='your-cluster-arn',
    secretArn='your-secret-arn',
    database='iot_devices',
    sql='SELECT * FROM devices WHERE status = :status LIMIT 10',
    parameters=[
        {'name': 'status', 'value': {'stringValue': 'PROVISIONED'}}
    ]
)

print(json.dumps(response['records'], indent=2))
```
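
Writing the `parameters` list by hand gets repetitive once a query takes several values. The helper below is a minimal sketch that maps plain Python values to the Data API field types. The name `to_sql_parameters` is hypothetical, and it handles only scalar types (no arrays, blobs, or type hints).

```python
from typing import Any, Dict, List


def to_sql_parameters(values: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Convert {"name": python_value} into a Data API `parameters` list."""
    parameters = []
    for name, value in values.items():
        if value is None:
            field = {"isNull": True}
        elif isinstance(value, bool):
            # Checked before int because bool is a subclass of int.
            field = {"booleanValue": value}
        elif isinstance(value, int):
            field = {"longValue": value}
        elif isinstance(value, float):
            field = {"doubleValue": value}
        elif isinstance(value, str):
            field = {"stringValue": value}
        else:
            raise TypeError(
                f"Unsupported parameter type for {name!r}: {type(value).__name__}"
            )
        parameters.append({"name": name, "value": field})
    return parameters


params = to_sql_parameters({"status": "PROVISIONED", "positioning_enabled": True})
print(params)
```

The result can be passed directly as `parameters=params` to `execute_statement`.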

### Get batch statistics
```python
import boto3

rds_client = boto3.client('rds-data')

response = rds_client.execute_statement(
    resourceArn='your-cluster-arn',
    secretArn='your-secret-arn',
    database='iot_devices',
    sql='''
        SELECT 
            batch_id,
            total_devices,
            successful_devices,
            failed_devices,
            status
        FROM batches 
        WHERE batch_id = :batch_id
    ''',
    parameters=[
        {'name': 'batch_id', 'value': {'stringValue': 'your-batch-id'}}
    ]
)

if response['records']:
    # Fields are positional, in SELECT order; each one is a dict keyed by its type
    record = response['records'][0]
    print(f"Batch: {record[0]['stringValue']}")
    print(f"Total: {record[1]['longValue']}")
    print(f"Success: {record[2]['longValue']}")
    print(f"Failed: {record[3]['longValue']}")
```
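
Indexing fields by position breaks as soon as the column order changes. If the call is made with `includeResultMetadata=True`, the response also carries `columnMetadata`, which allows rows to be keyed by column name. The sketch below assumes that option was set. The helper names are hypothetical, and `sample_response` is hand-written to mimic the response shape, not real output.

```python
from typing import Any, Dict, List


def field_value(field: Dict[str, Any]) -> Any:
    """Unwrap one Data API field such as {"stringValue": "x"} or {"isNull": True}."""
    if field.get("isNull"):
        return None
    # Each field holds a single typed key (stringValue, longValue, ...).
    return next(iter(field.values()))


def records_to_dicts(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Pair each record with the column names found in `columnMetadata`."""
    names = [column["name"] for column in response["columnMetadata"]]
    return [
        {name: field_value(field) for name, field in zip(names, record)}
        for record in response.get("records", [])
    ]


# Hand-written sample shaped like an execute_statement response.
sample_response = {
    "columnMetadata": [
        {"name": "batch_id"},
        {"name": "total_devices"},
        {"name": "completed_at"},
    ],
    "records": [
        [{"stringValue": "batch-001"}, {"longValue": 250}, {"isNull": True}],
    ],
}

rows = records_to_dicts(sample_response)
print(rows[0]["batch_id"], rows[0]["total_devices"], rows[0]["completed_at"])
```

Array fields (`arrayValue`) are returned as their nested dict unchanged in this sketch.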

---

## Tips

1. **Use Indexes**: The schema includes indexes on commonly queried columns (status, batch_id, created_at)
2. **Limit Results**: Always use `LIMIT` for large tables to avoid timeouts; the Data API also terminates calls whose response data exceeds 1 MB
3. **Use Parameters**: Use parameterized queries to prevent SQL injection
4. **Monitor Performance**: Check `pg_stat_user_indexes` to ensure indexes are being used
5. **Archive Old Data**: Regularly archive or delete old records to maintain performance
6. **Use Transactions**: For multiple related updates, use RDS Data API transactions
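
Tip 6 can be sketched with the boto3 `rds-data` transaction calls (`begin_transaction`, `commit_transaction`, `rollback_transaction`). The helper name `run_in_transaction` and its signature are illustrative assumptions, not part of the project. The client is passed in, so the function works with any object exposing those four methods.

```python
from typing import Any, Dict, List, Tuple

# One statement: (sql text, Data API parameters list)
Statement = Tuple[str, List[Dict[str, Any]]]


def run_in_transaction(
    client: Any,
    resource_arn: str,
    secret_arn: str,
    database: str,
    statements: List[Statement],
) -> List[Dict[str, Any]]:
    """Run all statements in one transaction; roll back if any of them fails."""
    target = {"resourceArn": resource_arn, "secretArn": secret_arn}
    transaction_id = client.begin_transaction(database=database, **target)[
        "transactionId"
    ]
    try:
        results = [
            client.execute_statement(
                database=database,
                sql=sql,
                parameters=parameters,
                transactionId=transaction_id,
                **target,
            )
            for sql, parameters in statements
        ]
    except Exception:
        # Undo everything executed so far, then surface the original error.
        client.rollback_transaction(transactionId=transaction_id, **target)
        raise
    client.commit_transaction(transactionId=transaction_id, **target)
    return results
```

For example, calling it with `boto3.client('rds-data')` and two `UPDATE` statements (one on `devices`, one on `batches`) would apply both changes or neither.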

---

## Common Use Cases

### Monitor batch progress
```sql
SELECT 
    batch_id,
    operation,
    total_devices,
    processed_devices,
    ROUND((processed_devices::DECIMAL / NULLIF(total_devices, 0) * 100), 2) as progress,
    status,
    created_at
FROM batches 
WHERE status = 'PROCESSING'
ORDER BY created_at DESC;
```

### Find devices that need retry
```sql
SELECT smsn, device_name, status_details, created_at
FROM devices 
WHERE status = 'FAILED' 
  AND status_details LIKE '%ThrottlingException%'
ORDER BY created_at DESC;
```

### Daily summary report
```sql
SELECT 
    DATE(created_at) as date,
    COUNT(*) as total_devices,
    COUNT(CASE WHEN status = 'PROVISIONED' THEN 1 END) as successful,
    COUNT(CASE WHEN status = 'FAILED' THEN 1 END) as failed,
    ROUND(
        COUNT(CASE WHEN status = 'PROVISIONED' THEN 1 END)::DECIMAL / 
        COUNT(*) * 100, 
        2
    ) as success_rate
FROM devices 
WHERE created_at > NOW() - INTERVAL '30 days'
GROUP BY DATE(created_at)
ORDER BY date DESC;
```
