Standup notes are where productivity often goes to die.
They live in Slack or Notion for ten minutes, get buried by the next thread, and action items quietly disappear into the “I’ll remember that later” void.
To fix this, I built a CLI tool, an AI standup notes parser, that turns messy standup notes into structured action items. It pipes unstructured human chaos into strictly validated JSON.
Here’s the technical breakdown of the design decisions, the failures along the way, and why regex was never an option.

The Problem: Human Chaos vs. Brittle Parsers
Standup notes are messy by nature. They contain sentence fragments, implied ownership, and vague deadlines.
Example:
“Mike also needs to review Sarah’s PR by EOD today – Sarah said that’s high priority.”
Traditional parsing approaches like regex or manual rule systems break down quickly.
Why?
- Ownership is contextual
  “Mike needs to review Sarah’s PR” means the owner is Mike, not Sarah.
- Deadlines are relative
  Phrases like “EOD today”, “next Tuesday”, or “tomorrow morning” require temporal understanding.
- Priorities are subjective
  “High priority” might be written explicitly, or only implied through tone.
This type of text sits in the sweet spot for LLM-based information extraction.
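To make the brittleness concrete, here is a minimal sketch of a rule-based owner extractor. The two patterns and the `naive_owner` helper are hypothetical, written for illustration only; they are not taken from the tool.

```python
import re

# Rule 1: "Name: text" speaker lines, the tidy case.
SPEAKER_LINE = re.compile(r"^(?P<owner>[A-Z][a-z]+):\s+(?P<task>.+)$")
# Rule 2 (fallback): assume the first capitalized word is the owner.
FIRST_CAPITALIZED = re.compile(r"\b([A-Z][a-z]+)\b")


def naive_owner(line: str):
    """Guess the owner of a standup line using two hand-written rules."""
    match = SPEAKER_LINE.match(line)
    if match:
        return match.group("owner")
    match = FIRST_CAPITALIZED.search(line)
    return match.group(1) if match else None


tidy = "Sarah: I'm finishing the auth fix, should be done by Friday."
messy = "Oh, Mike also needs to review Sarah's PR by EOD today - Sarah said that's high priority."

print(naive_owner(tidy))   # the speaker-line rule handles this one
print(naive_owner(messy))  # the fallback picks the interjection, not Mike
```

Each new phrasing needs another rule, and every rule added this way has its own failure cases.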
The Approach: Gemini + Pydantic
The stack I chose was intentionally simple:
- Gemini 1.5 Flash via the google-genai SDK
- Pydantic v2 for schema validation
Why this combination?
LLMs for Extraction
LLMs are extremely good at mapping human language into structured entities like:
- owners
- tasks
- deadlines
- priorities
- blockers
Pydantic for Enforcement
Even when an LLM returns valid JSON, the structure might not match your schema.
Pydantic acts as the gatekeeper, ensuring every action item strictly conforms to the required format.
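As a rough illustration of that gatekeeping, the sketch below validates two payloads against a cut-down model with the same fields the post uses. The `validate_item` helper and the sample payloads are assumptions made for the example.

```python
from typing import Literal, Optional

from pydantic import BaseModel, ValidationError


class ActionItem(BaseModel):
    owner: str
    task: str
    deadline: str
    priority: Literal["high", "medium", "low"]
    blocked_by: Optional[str] = None


def validate_item(payload: dict) -> Optional[ActionItem]:
    """Return a validated ActionItem, or None if the payload breaks the schema."""
    try:
        return ActionItem.model_validate(payload)
    except ValidationError:
        return None


good = {
    "owner": "Mike",
    "task": "Review Sarah's PR",
    "deadline": "EOD today",
    "priority": "high",
}
# Well-formed JSON, but "urgent" is not one of the allowed priority values.
bad = {**good, "priority": "urgent"}

print(validate_item(good))
print(validate_item(bad))
```

The second payload is perfectly valid JSON, which is exactly why a schema check is still needed after the model responds.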
The Core Schema Design
Instead of parsing a single item, I defined a wrapper model.
from typing import List, Literal, Optional

from pydantic import BaseModel, Field


class ActionItem(BaseModel):
    owner: str = Field(..., description="The person responsible for the task")
    task: str = Field(..., description="A concise description of what needs to be done")
    deadline: str = Field(..., description="When the task should be completed")
    priority: Literal['high', 'medium', 'low']
    blocked_by: Optional[str] = None


class ActionItemList(BaseModel):
    items: List[ActionItem]
Why the ActionItemList Wrapper?
The google-genai SDK’s structured output performs best when a top-level Pydantic model is provided.
Passing List[ActionItem] directly often triggers schema validation issues inside the SDK’s internal transformer.
Wrapping the list inside a model guarantees a stable schema the model can consistently follow.
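One way to see the difference, using only Pydantic: the wrapper produces a top-level JSON Schema of type "object", while a bare list produces type "array". Whether that is the exact reason the SDK rejects the bare list is an assumption on my part; the sketch only shows the two schema shapes and how a wrapped payload parses. The `raw` string is made-up sample data.

```python
from typing import List, Literal, Optional

from pydantic import BaseModel, TypeAdapter


class ActionItem(BaseModel):
    owner: str
    task: str
    deadline: str
    priority: Literal["high", "medium", "low"]
    blocked_by: Optional[str] = None


class ActionItemList(BaseModel):
    items: List[ActionItem]


# Top-level schema types: the wrapper is an object, the bare list is an array.
wrapper_type = ActionItemList.model_json_schema()["type"]
bare_list_type = TypeAdapter(List[ActionItem]).json_schema()["type"]
print(wrapper_type, bare_list_type)

# A wrapped payload, in the shape the wrapper schema describes.
raw = (
    '{"items": [{"owner": "Sarah", "task": "Finish the auth fix", '
    '"deadline": "Friday", "priority": "high", "blocked_by": null}]}'
)
parsed = ActionItemList.model_validate_json(raw)
owners = [item.owner for item in parsed.items]
print(owners)
```

Callers unwrap `.items` once and work with a plain list from there on.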
Model Choice: Flash vs Pro
This problem is primarily pattern recognition, not deep reasoning.
Using a large reasoning model here would be unnecessary.
Gemini 1.5 Flash handles the task:
- in under roughly 2 seconds
- at extremely low cost
- with excellent extraction accuracy
For this workload, using a heavier model would simply increase latency and cost without improving results.
What Actually Broke: The SDK Migration Trap
The biggest failure during development came from migrating from:
google-generativeai
to the new:
google-genai
The Error
ValueError: Unsupported schema type: additional_properties=None defs=None
The Cause
The new SDK is much stricter about Python type handling.
Initially I passed:
response_schema = List[ActionItem]
The SDK’s internal transformer could not convert this generic Python list into a valid OpenAPI schema.
The Fix
Two changes resolved the issue:
1. Introduce the ActionItemList wrapper
response_schema = ActionItemList
2. Use the correct model alias
Using:
gemini-1.5-flash
triggered 404 NOT_FOUND errors.
The working alias was:
gemini-flash-latest
This avoids routing problems across different API versions (v1beta vs v1).
Example: From Chaos to Structured Data
Input:
Sarah: I'm finishing the auth fix, should be done by Friday.
Mike: Still stuck on the database setup because infra hasn't responded.
Jane: Mockups tomorrow, aiming for next Tuesday.
Output:
[
  {
    "owner": "Sarah",
    "task": "Finish the auth fix",
    "deadline": "Friday",
    "priority": "high",
    "blocked_by": null
  },
  {
    "owner": "Mike",
    "task": "Database setup",
    "deadline": "Not specified",
    "priority": "medium",
    "blocked_by": "Infra team hasn't responded"
  }
]
This structured output makes it trivial to feed the data into:
- dashboards
- databases
- automation pipelines
- ticket systems
- weekly summaries
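As one example of such a downstream step, here is a minimal sketch that loads the JSON output into an in-memory SQLite table. The table name, columns, and the `store_action_items` helper are assumptions for illustration; the tool itself does not ship a database layer.

```python
import json
import sqlite3

OUTPUT_JSON = """
[
  {"owner": "Sarah", "task": "Finish the auth fix", "deadline": "Friday",
   "priority": "high", "blocked_by": null},
  {"owner": "Mike", "task": "Database setup", "deadline": "Not specified",
   "priority": "medium", "blocked_by": "Infra team hasn't responded"}
]
"""


def store_action_items(conn: sqlite3.Connection, items: list) -> int:
    """Insert action items and return the total number of stored rows."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS action_items (
               owner TEXT NOT NULL,
               task TEXT NOT NULL,
               deadline TEXT NOT NULL,
               priority TEXT NOT NULL,
               blocked_by TEXT
           )"""
    )
    # Named placeholders map directly onto the keys of each JSON object.
    conn.executemany(
        "INSERT INTO action_items (owner, task, deadline, priority, blocked_by) "
        "VALUES (:owner, :task, :deadline, :priority, :blocked_by)",
        items,
    )
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM action_items").fetchone()[0]


conn = sqlite3.connect(":memory:")
count = store_action_items(conn, json.loads(OUTPUT_JSON))
blocked = conn.execute(
    "SELECT owner FROM action_items WHERE blocked_by IS NOT NULL"
).fetchall()
print(count, blocked)
```

Because the fields are already validated, the insert needs no per-row cleanup, and a query for blocked tasks is a one-liner.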
When This Approach Works (and When It Doesn’t)
Use this when
You need to convert human-written notes into structured data for:
- dashboards
- notifications
- internal summaries
- lightweight project tracking
Avoid this when
You require absolute correctness.
LLMs can hallucinate or miss edge cases. If a missed task would cause catastrophic consequences, human review is still required.
Think of this system as a productivity multiplier, not a Jira replacement.
Next Steps for Production
If I were deploying this system tomorrow, I would add:
- Confidence Scoring
  Ask the model to output a confidence score and flag items below 0.8 for human verification.
- Retry Logic
  Handle API throttling using exponential backoff for 429 RESOURCE_EXHAUSTED errors. These were fairly common during testing on the free tier.
- Tooling Integration
  Feed the structured JSON directly into tools like Linear, Notion, or Jira. The CLI could automatically create tickets from standup notes.
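The first two ideas could look something like the sketch below. Everything in it is hypothetical: `ScoredItem` and its `confidence` field are not part of the current schema, and `RateLimitError` is a stand-in for whatever exception the SDK raises on a 429, which the post does not specify.

```python
import random
import time
from dataclasses import dataclass
from typing import Callable, List, Tuple, TypeVar

T = TypeVar("T")


@dataclass
class ScoredItem:
    owner: str
    task: str
    confidence: float  # hypothetical: model's self-reported score in [0, 1]


def split_by_confidence(
    items: List[ScoredItem], threshold: float = 0.8
) -> Tuple[List[ScoredItem], List[ScoredItem]]:
    """Return (accepted, flagged); items below the threshold go to review."""
    accepted = [item for item in items if item.confidence >= threshold]
    flagged = [item for item in items if item.confidence < threshold]
    return accepted, flagged


class RateLimitError(Exception):
    """Stand-in for the SDK's 429 RESOURCE_EXHAUSTED error (assumption)."""


def call_with_backoff(
    fn: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on RateLimitError with exponential backoff and jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Delay doubles each attempt; jitter avoids synchronized retries.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
    raise AssertionError("unreachable")
```

In the real script, `fn` would wrap the `generate_content` call, and the `except` clause would catch the SDK's actual rate-limit exception instead of the stand-in. The injectable `sleep` parameter exists only so the retry loop can be exercised without waiting.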
Cost Estimate
Using Gemini Flash pricing, you could process thousands of standup notes for pennies.
For most teams, that’s potentially a very high ROI for a small automation layer.
Full AI Standup Notes Parser script
import os
import json
import logging
from typing import List, Optional, Literal

from pydantic import BaseModel, Field, ValidationError
from google import genai
from dotenv import load_dotenv

# Configure logging
logging.basicConfig(level=logging.INFO, format='%(levelname)s: %(message)s')
logger = logging.getLogger(__name__)


# Pydantic model for Action Items
class ActionItem(BaseModel):
    owner: str = Field(..., description="The person responsible for the task")
    task: str = Field(..., description="A concise description of what needs to be done")
    deadline: str = Field(..., description="When the task should be completed (e.g., 'Friday', 'EOD today')")
    priority: Literal['high', 'medium', 'low'] = Field(..., description="The priority level of the task")
    blocked_by: Optional[str] = Field(None, description="What or who is blocking the task, if any")


# Wrapper model for a list of Action Items (better for SDK compatibility)
class ActionItemList(BaseModel):
    items: List[ActionItem]


# Realistic, messy sample standup notes
SAMPLE_NOTES = """
Standup March 10th:
Sarah: I'm finishing the auth fix, should be done by Friday.
Mike: Still stuck on the database setup because the infra team hasn't responded to my ticket yet.
Jane: I'll start UI mockups tomorrow, aiming for next Tuesday.
Oh, Mike also needs to review Sarah's PR by EOD today - Sarah said that's high priority.
Jane's mockup work is probably medium.
And we need someone to check the logs but no one's on it yet.
"""


def extract_action_items(notes: str, api_key: str) -> List[ActionItem]:
    """
    Uses Google Gemini Flash to extract structured action items from unstructured notes.
    """
    # Use gemini-flash-latest for maximum compatibility and reliability across tiers
    MODEL_ID = "gemini-flash-latest"
    client = genai.Client(api_key=api_key)
    prompt = f"""
    Extract a structured list of action items from the following meeting notes.
    Notes:
    {notes}
    """
    try:
        logger.info(f"Sending notes to Gemini {MODEL_ID} for extraction...")
        response = client.models.generate_content(
            model=MODEL_ID,
            contents=prompt,
            config={
                'response_mime_type': 'application/json',
                'response_schema': ActionItemList,
            }
        )
        # The SDK returns a parsed object if a schema is provided
        if response.parsed is not None:
            return response.parsed.items
        else:
            # Fallback for empty or unexpected responses
            logger.warning("Gemini returned an empty or unparsable response.")
            return []
    except ValidationError as e:
        logger.error(f"Pydantic validation failed: {e}")
        raise
    except Exception as e:
        logger.error(f"An unexpected error occurred during API call: {e}")
        raise


def main():
    # Try to load from .env or environment
    load_dotenv()
    api_key = os.getenv("GOOGLE_API_KEY")
    if not api_key:
        print("\n[!] ERROR: GOOGLE_API_KEY not found in environment variables.")
        print("Please set it in a .env file or export it: export GOOGLE_API_KEY='your-key'\n")
        return
    print("--- RAW STANDUP NOTES ---")
    print(SAMPLE_NOTES.strip())
    print("-" * 20 + "\n")
    try:
        items = extract_action_items(SAMPLE_NOTES, api_key)
        print("--- EXTRACTED ACTION ITEMS (VALIDATED JSON) ---")
        # Print the validated objects as a clean JSON list
        output_json = [item.model_dump() for item in items]
        print(json.dumps(output_json, indent=2))
        print("-" * 40)
    except Exception:
        # Errors are already logged in the helper function
        pass


if __name__ == "__main__":
    main()
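The script above runs against the hard-coded SAMPLE_NOTES. To use it as a real CLI, the notes could come from a file or from stdin instead. The following is a minimal sketch of that input step; the `read_notes` helper, its `notes_file` argument, and the `stdin` parameter are hypothetical additions, not part of the script as published.

```python
import argparse
import io
import sys
from typing import List, Optional, TextIO


def read_notes(argv: Optional[List[str]] = None, stdin: TextIO = sys.stdin) -> str:
    """Read standup notes from a file path, or from stdin when no path is given."""
    parser = argparse.ArgumentParser(
        description="Extract action items from standup notes."
    )
    parser.add_argument(
        "notes_file",
        nargs="?",
        default="-",
        help="path to a notes file, or '-' (the default) to read from stdin",
    )
    args = parser.parse_args(argv)
    if args.notes_file == "-":
        return stdin.read()
    with open(args.notes_file, encoding="utf-8") as handle:
        return handle.read()


# Demo with an in-memory stream standing in for piped input.
demo = read_notes([], stdin=io.StringIO("Sarah: auth fix by Friday.\n"))
print(demo)
```

With something like this in place, main() could pass the result of read_notes() to extract_action_items() in place of SAMPLE_NOTES, so notes can be piped in from another command.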
