Dylan Goldblatt, Ph.D.
Kennesaw State University
Slides: unexpected.vercel.app










Transforming Unstructured Sources


| Natural Language Processing | Audio | Computer Vision | Multimodal |
|---|---|---|---|
| Table Classification; Token Classification; Text Classification; Table Question Answering; Zero-Shot Classification; Question Answering; Summarization; Text Generation; Fill-Mask; Text2Text Generation; Feature Extraction; Text Ranking | Text-to-Speech; Automatic Speech Recognition; Audio Classification; Text-to-Audio; Audio-to-Audio; Voice Activity Detection | Image Classification; Object Detection; Image Segmentation; Video Classification; Image-to-Text; Image Generation; Unconditional Image Generation; Zero-Shot Image Classification; Depth Estimation; Image-to-Image; Super Resolution; Image Restoration; Image Classification Explanation; Document Question Answering | Text-to-Text; Image-Text-to-Text; Visual Question Answering; Document Question Answering; Visual Document Retrieval; Any-to-Any |

| Project | Unstructured Data | Techniques & Tools |
|---|---|---|
| U Illinois (Brown Dog) | Images, audio, video, documents (various legacy files, “dark data”) | File format converters (DAP); content-based search & metadata extraction (DTS); web services integrating OCR, format translation, etc. |
| UW–Madison (XDD) | Text, figures, tables from 18M+ scholarly articles (PDF/PubMed) | Natural language processing (entity tagging, parsing); OCR for scanned docs; custom text-mining applications via an API |
| CMU & Georgetown (Six Degrees of Francis Bacon) | Textual biographies (Oxford DNB historical entries) | Named Entity Recognition (Stanford NLP, LingPipe – combined 85% recall); graph learning via Poisson Graphical Lasso for relationship inference; expert validation of results |
| Library of Congress & UW (Newspaper Navigator) | Images + text from 16 million newspaper pages (OCR’d) | Deep learning visual content recognition (Faster R-CNN) to detect photos, cartoons, ads, maps, etc.; OCR text alignment for captions; image embeddings for similarity search |
| Northeastern (Telegram Extremism Study) | Social media posts: images + text from Telegram channel | Cloud Vision API labels + K-means clustering for image themes; spaCy NER and Gensim LDA for text topics; statistical regression analysis (views vs content features) |
| Stanford (CheXpert) | Clinical narratives (radiology reports) + images (X-rays) | Rule-based NLP pipeline (dictionary matching, negation detection) to label 14 conditions per report; produced labels for >220k X-ray images; used labels to train CNN diagnostic models |
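
To make the "dictionary matching, negation detection" idea from the CheXpert row concrete, here is a minimal Python sketch of a rule-based report labeler. It is not the CheXpert labeler itself: the three-condition vocabulary, the negation cue list, and the function name `label_report` are all illustrative assumptions, and the negation rule (any cue earlier in the same sentence) is much cruder than what a production pipeline would use.

```python
import re

# Hypothetical mini-vocabulary (assumption): condition label -> phrases that mention it.
CONDITION_TERMS = {
    "Pneumonia": ["pneumonia"],
    "Pleural Effusion": ["pleural effusion", "effusion"],
    "Cardiomegaly": ["cardiomegaly", "enlarged heart"],
}

# Hypothetical negation cues (assumption), searched for earlier in the same sentence.
NEGATION_CUES = ["no", "without", "negative for", "no evidence of"]


def label_report(text):
    """Return {condition: 1 (positive), 0 (negated), None (not mentioned)}."""
    labels = {condition: None for condition in CONDITION_TERMS}
    # Split the lowercased report into rough sentences.
    sentences = re.split(r"[.;\n]+", text.lower())
    for sentence in sentences:
        for condition, terms in CONDITION_TERMS.items():
            for term in terms:
                # Dictionary matching: whole-word search for the term.
                match = re.search(r"\b" + re.escape(term) + r"\b", sentence)
                if match is None:
                    continue
                # Negation detection: look for a cue before the matched term.
                preceding = sentence[:match.start()]
                negated = any(
                    re.search(r"\b" + re.escape(cue) + r"\b", preceding)
                    for cue in NEGATION_CUES
                )
                if not negated:
                    # A positive mention anywhere in the report wins.
                    labels[condition] = 1
                elif labels[condition] is None:
                    labels[condition] = 0
                break  # one matching term per condition per sentence is enough
    return labels


report = (
    "No evidence of pneumonia. "
    "Small left pleural effusion is present. "
    "Heart size is normal."
)
print(label_report(report))
```

Because the cue search covers everything before the matched term in the sentence, a sentence such as "no pneumonia but effusion is present" would wrongly mark the effusion as negated. Limiting the scope of each cue is the usual next refinement.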










| Kickstarter | Initiatives | Grand Challenges |
|---|---|---|
| Up to $5,000 | Up to $10,000 | $100,000/year |
| 1 Semester | 1 Year | 2 Years |
| Teams of ≥ 2 (at least 1 KSU faculty) | Teams of ≥ 2 KSU faculty | Teams of ≥ 3 KSU faculty |
| Potential deliverables: publication, local exhibition/performance, small grant | Deliverable: grant with >$100k in direct costs or other significant work (e.g., performance/exhibition with regional draw) | Deliverable: large-scale grant with >$1M in direct costs plus ≥ 2 team publications |
| Due April 10, 2025 | Due April 15, 2025 | Due April 17, 2025 |

Thank you for attending the AI Fair!
