Analyze Workflow
Analyze email patterns, statistics, and trends using mu queries and data processing.
Workflow Steps
1. Understand Analysis Request
Identify what the user wants to analyze:
- Volume: Email counts over time
- People: Top senders/recipients, communication patterns
- Topics: Subject patterns, keyword frequency
- Threads: Conversation analysis
- Attachments: File type distribution
- Response times: Time between emails in threads
- Activity patterns: Time of day, day of week
2. Gather Data
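Use mu find with appropriate queries and JSON output. The jq expressions in this workflow assume one object per message with plain keys (from, date as a Unix timestamp, size); some mu versions instead emit a single array with colon-prefixed keys such as :from and :date-unix, so inspect a sample first and adjust the filters: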
# Get structured data for analysis
mu find <query> --format=json > emails.json
# Count emails by criteria
mu find <query> | wc -l
# Get specific date ranges
mu find date:20250101..20250131 --format=json
3. Process Data
Use shell tools to analyze:
# Top senders
mu find <query> --format=json | jq -r '.from' | sort | uniq -c | sort -rn
# Emails by month
mu find <query> --format=json | jq -r '.date | strftime("%Y-%m")' | sort | uniq -c
# Attachment types
mu find flag:attach --format=json | jq -r '.attachments[]?.name' | sed 's/.*\.//' | sort | uniq -c
# Average email size
mu find <query> --format=json | jq '.size' | awk '{sum+=$1; n++} END {if (n) print sum/n}'
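The counting idiom above (sort | uniq -c | sort -rn) recurs throughout this workflow. A minimal sketch of a reusable helper, assuming one value per input line; the name top_n is hypothetical and not part of mu or jq:

```shell
# top_n N: read one value per line on stdin and print the N most frequent
# values as "count<TAB>value", highest count first (ties sorted by value).
# Hypothetical helper for illustration only.
top_n() {
  local n="${1:-10}"
  awk 'NF { count[$0]++ }
       END { for (v in count) printf "%d\t%s\n", count[v], v }' |
    LC_ALL=C sort -t "$(printf '\t')" -k1,1nr -k2,2 |
    head -n "$n"
}
```

It could replace the tail of the "Top senders" pipeline, for example: mu find <query> --format=json | jq -r '.from' | top_n 10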
4. Visualize Results
Present findings clearly:
- Tables: Formatted counts and statistics
- Lists: Top N senders, subjects, etc.
- Summaries: Key insights and patterns
- Comparisons: Personal vs work, this month vs last month
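As one way to produce the tables mentioned above, the following sketch renders counts as aligned columns with each row's share of the total. It assumes input lines of the form "count<TAB>label"; the function name format_table and the column widths are illustrative choices:

```shell
# format_table: read "count<TAB>label" lines on stdin and print
# "label  count  percent" in aligned columns.
# Hypothetical helper; widths (20/6/5) are arbitrary.
format_table() {
  awk -F '\t' '
    { count[NR] = $1; label[NR] = $2; total += $1 }
    END {
      for (i = 1; i <= NR; i++)
        printf "%-20s %6d %5.1f%%\n", label[i], count[i],
               (total ? 100 * count[i] / total : 0)
    }'
}
```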
5. Provide Insights
Interpret the data:
- Identify trends (increasing/decreasing volume)
- Highlight patterns (busiest times, top correspondents)
- Suggest actions (archive old threads, follow up on flagged items)
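Trend statements such as "volume is increasing" are easier to support with a number. A minimal sketch that computes the relative change between two counts, assuming both are non-negative integers obtained from wc -l; the name pct_change is hypothetical:

```shell
# pct_change PREV CUR: print the relative change from PREV to CUR
# as a signed percentage, or "n/a" when PREV is zero.
# Hypothetical helper for illustration only.
pct_change() {
  awk -v prev="$1" -v cur="$2" 'BEGIN {
    if (prev == 0) { print "n/a"; exit }
    printf "%+.1f%%\n", 100 * (cur - prev) / prev
  }'
}
```

For example, pct_change 200 250 reports a 25% increase, which can then be phrased as an insight for the user.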
Common Analysis Patterns
Email Volume Analysis
Count emails by account:
echo "Personal: $(mu find 'maildir:/icloud/*' | wc -l)"
echo "Work: $(mu find 'maildir:/redhat/*' | wc -l)"
Count by month for the year:
for month in {01..12}; do
  count=$(mu find date:2025${month}01..2025${month}31 | wc -l)
  echo "2025-${month}: ${count}"
done
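The loop above ends every range on day 31 and so relies on mu tolerating that date for shorter months. If exact bounds are preferred, a sketch like the following computes the real last day of each month; month_range is a hypothetical helper, and the leap-year rule is the standard Gregorian one:

```shell
# month_range YEAR MONTH: print "YYYYMM01..YYYYMMDD", ending on the
# month's actual last day. Hypothetical helper for illustration only.
month_range() {
  local year month last
  year=$1
  month=${2#0}   # strip a leading zero so "08" is not read as octal
  case "$month" in
    4|6|9|11) last=30 ;;
    2)
      if [ $((year % 4)) -eq 0 ] && [ $((year % 100)) -ne 0 ] ||
         [ $((year % 400)) -eq 0 ]; then
        last=29
      else
        last=28
      fi ;;
    *) last=31 ;;
  esac
  printf '%04d%02d01..%04d%02d%02d\n' "$year" "$month" "$year" "$month" "$last"
}
```

Inside the loop, the query would then read: mu find "date:$(month_range 2025 "$month")" | wc -l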
Unread email count:
mu find flag:unread 'maildir:/icloud/*' | wc -l
mu find flag:unread 'maildir:/redhat/*' | wc -l
People Analysis
Top 10 senders:
mu find <query> --format=json | \
jq -r '.from' | \
sort | uniq -c | sort -rn | head -10
Email exchange with specific person:
mu find "from:alice@example.com OR to:alice@example.com" | wc -l
Communication frequency over time:
mu find from:alice@example.com --format=json | \
jq -r '.date | strftime("%Y-%m")' | \
sort | uniq -c
Topic Analysis
Common subject keywords:
mu find <query> --format=json | \
jq -r '.subject' | \
tr '[:upper:]' '[:lower:]' | \
grep -oE '\w+' | \
sort | uniq -c | sort -rn | head -20
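Raw word counts from subjects tend to be dominated by reply prefixes and filler words. A sketch of a filtered variant, assuming one subject per input line; the stop-word list and the three-character minimum are illustrative assumptions, and subject_keywords is a hypothetical name:

```shell
# subject_keywords: read subjects on stdin, print "count<TAB>word"
# for words longer than two characters that are not stop words.
# The stop-word list below is a small illustrative sample.
subject_keywords() {
  tr '[:upper:]' '[:lower:]' |
    tr -cs '[:alnum:]' '\n' |
    awk '
      BEGIN {
        split("fwd the and for with from this that your", words, " ")
        for (i in words) stop[words[i]] = 1
      }
      length($0) > 2 && !($0 in stop) { count[$0]++ }
      END { for (w in count) printf "%d\t%s\n", count[w], w }' |
    LC_ALL=C sort -t "$(printf '\t')" -k1,1nr -k2,2
}
```

It would take the place of the tr, grep, and sort steps: mu find <query> --format=json | jq -r '.subject' | subject_keywords | head -20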
Emails by project (work):
for project in knative kubernetes konflux; do
  count=$(mu find "maildir:/redhat/${project}/*" | wc -l)
  echo "${project}: ${count}"
done
Attachment Analysis
Count of messages with attachments:
mu find flag:attach | wc -l
Attachment types distribution:
mu find flag:attach --format=json | \
jq -r '.attachments[]?.name' | \
sed 's/.*\.//' | \
tr '[:upper:]' '[:lower:]' | \
sort | uniq -c | sort -rn
Large emails with attachments:
mu find flag:attach size:1M.. --format=json | \
jq -r '"\(.size) \(.subject)"' | \
sort -rn
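The sed 's/.*\.//' step in the distribution pipeline above returns the whole filename when a name has no dot, which inflates the list with one-off "extensions". A sketch that reports such names as "(none)" instead, assuming one filename per input line; file_extensions is a hypothetical helper:

```shell
# file_extensions: read filenames on stdin, print the lowercased
# extension of each, or "(none)" when there is no usable extension.
# Hypothetical helper for illustration only.
file_extensions() {
  awk '{
    name = tolower($0)
    n = split(name, parts, /[.]/)
    # No dot, a trailing dot, or only a leading dot (".bashrc")
    # all count as "no extension".
    if (n < 2 || parts[n] == "" || (n == 2 && parts[1] == ""))
      print "(none)"
    else
      print parts[n]
  }'
}
```

Its output can be piped into sort | uniq -c | sort -rn in place of the sed and tr steps.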
Thread Analysis
Thread depth (replies):
mu find <query> --format=json | \
jq -r 'select(.references != null) | .references | length' | \
awk '{sum+=$1; n++} END {if (n) print "Avg replies:", sum/n}'
Longest threads:
mu find <query> --format=json | \
jq -r 'select(.references != null) | "\(.references | length) \(.subject)"' | \
sort -rn | head -10
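The reference count measures how deep one message sits in a thread, so a long thread appears once per message in the list above. A rough alternative is to group messages by subject after stripping reply and forward prefixes. This is a swapped-in approximation, not how mu itself threads messages, and the names normalize_subject and thread_sizes are hypothetical:

```shell
# normalize_subject: strip any number of leading "Re:", "Fw:" or
# "Fwd:" prefixes (case-insensitive) from each input line.
normalize_subject() {
  sed -E -e ':a' \
    -e 's/^[[:space:]]*([Rr][Ee]|[Ff][Ww][Dd]?)[[:space:]]*:[[:space:]]*//' \
    -e 'ta'
}

# thread_sizes: read subjects on stdin, print "count<TAB>subject"
# per normalized subject, largest group first.
thread_sizes() {
  normalize_subject |
    awk 'NF { c[$0]++ } END { for (s in c) printf "%d\t%s\n", c[s], s }' |
    LC_ALL=C sort -t "$(printf '\t')" -k1,1nr -k2,2
}
```

Unrelated conversations that share a subject will be merged, so treat the result as an estimate.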
Temporal Analysis
Emails by day of week:
mu find <query> --format=json | \
jq -r '.date | strftime("%A")' | \
sort | uniq -c
Emails by hour of day:
mu find <query> --format=json | \
jq -r '.date | strftime("%H")' | \
sort | uniq -c | sort -k2 -n
Activity timeline (last 7 days):
for i in {0..6}; do
  # date -d is GNU date syntax (coreutils)
  day=$(date -d "-${i} days" +%Y%m%d)
  count=$(mu find "date:${day}..${day}" | wc -l)
  echo "$(date -d "-${i} days" +%Y-%m-%d): ${count}"
done
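The day-of-week pipeline above sorts day names alphabetically and omits days with no mail. A sketch that prints all seven days in calendar order, assuming English day names (as produced by strftime("%A") in the C locale) on stdin; weekday_counts is a hypothetical helper:

```shell
# weekday_counts: read one English day name per line on stdin and
# print a count for every day, Monday through Sunday, including zeros.
# Hypothetical helper for illustration only.
weekday_counts() {
  awk '
    BEGIN {
      n = split("Monday Tuesday Wednesday Thursday Friday Saturday Sunday",
                days, " ")
    }
    { count[$0]++ }
    END {
      for (i = 1; i <= n; i++)
        printf "%-9s %d\n", days[i], count[days[i]] + 0
    }'
}
```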
Best Practices
Performance
- Use specific maildir queries to limit scope
- Process JSON output for complex analysis
- Use streaming tools (jq, awk) for large datasets
- Cache results for repeated analysis
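One possible way to cache results is to store a command's output in a file keyed by its arguments and reuse it while it is fresh. The sketch below assumes find supports -mmin (GNU, BSD, and busybox do); the cache location and the name cached are hypothetical:

```shell
# Directory for cached outputs (an arbitrary, illustrative location).
cache_dir="${TMPDIR:-/tmp}/mu-analysis-cache"

# cached MAX_AGE_MINUTES COMMAND [ARGS...]: print COMMAND's output,
# re-running it only when no cached copy newer than MAX_AGE_MINUTES exists.
cached() {
  local max_age key file
  max_age=$1
  shift
  # Key the cache file on a checksum of the full command line.
  key=$(printf '%s\n' "$@" | cksum | cut -d ' ' -f 1)
  file="${cache_dir}/${key}.out"
  mkdir -p "$cache_dir"
  if [ -z "$(find "$file" -mmin "-${max_age}" 2>/dev/null)" ]; then
    # Write to a temp file first so a failed run never leaves a partial cache.
    "$@" > "${file}.tmp" && mv "${file}.tmp" "$file" || return 1
  fi
  cat "$file"
}
```

For example, cached 30 mu find 'maildir:/redhat/*' --format=json would hit the mu index at most once every 30 minutes.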
Privacy
- Aggregate personal and work data separately
- Redact email addresses in summaries when appropriate
- Be careful with subject content in analysis
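Redaction can be applied as a final filter on any summary before it is shown. A minimal sketch that keeps the first character of the local part and the full domain; the masking style and the name redact_emails are illustrative assumptions, and the pattern covers common address characters rather than the full RFC 5322 grammar:

```shell
# redact_emails: mask the local part of every email address on stdin,
# e.g. alice@example.com becomes a***@example.com.
# Hypothetical helper; matches common address characters only.
redact_emails() {
  sed -E 's/([A-Za-z0-9._%+-])[A-Za-z0-9._%+-]*@/\1***@/g'
}
```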
Data Quality
- Handle missing fields gracefully (use jq select to skip null values, or the // alternative operator to supply a default)
- Account for timezone differences in date analysis
- Normalize data (lowercase, trim) for accurate counts
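Normalization matters most for sender counts, where "Alice <Alice@Example.com>" and "alice@example.com" would otherwise be tallied separately. A sketch that reduces each sender to a bare lowercase address, assuming one sender per input line in either "Name <addr>" or plain "addr" form; normalize_address is a hypothetical helper:

```shell
# normalize_address: read one sender per line on stdin and print
# the bare, trimmed, lowercased email address.
# Hypothetical helper for illustration only.
normalize_address() {
  sed -E \
    -e 's/^.*<([^<>]+)>.*$/\1/' \
    -e 's/^[[:space:]]+//' \
    -e 's/[[:space:]]+$//' |
    tr '[:upper:]' '[:lower:]'
}
```

Placing it between the jq step and sort | uniq -c in a sender pipeline merges the variants into one count.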
Tool Recommendations
Use these tools for analysis (all available in nixpkgs):
# In nix-shell
nix-shell -p jq gnugrep gawk coreutils dateutils
# Or using nix-shell shebang in scripts
#!/usr/bin/env nix-shell
#! nix-shell -i bash -p jq gnugrep gawk
- jq - JSON processing and querying
- awk - Text processing and calculations
- grep/sed - Pattern matching and text manipulation
- sort/uniq - Counting and deduplication
- dateutils - Advanced date manipulation
Examples
Monthly email volume comparison:
echo "Previous 30 days: $(mu find date:60d..30d | wc -l)"
echo "Last 30 days: $(mu find date:30d.. | wc -l)"
Top work correspondents:
echo "Top 10 work email senders:"
mu find 'maildir:/redhat/*' --format=json | \
jq -r '.from' | \
sort | uniq -c | sort -rn | head -10
Busiest email hour:
echo "Email activity by hour:"
mu find date:30d.. --format=json | \
jq -r '.date | strftime("%H")' | \
sort | uniq -c | sort -rn | head -5
Integration
This workflow usually runs as one step in the following sequence:
- Search workflow - Initial data gathering
- Analyze workflow - Process and analyze data
- Present findings to user
- Offer drill-down with Search or View workflows