Last year tonight @ NH7 Weekender Meghalaya 2017

Last year tonight, I was in Meghalaya for NH7 Weekender with my friends (also co-workers at the time), Deepu and Rishabh (aka Jain). We'd been aching to visit Meghalaya after watching Ethereal's "Meghalaya Alive!" video. And after hearing about Weekender happening there (with Steve Vai headlining it) there was no …

Announcing Excalibur, a Web Interface to Extract Tabular Data from PDFs

Last week, Camelot trended at #1 on Hacker News and GitHub, and at #5 on Product Hunt. Thank you for the love! There's still a lot to do to make it more awesome. You can follow the roadmap on its GitHub wiki. You can also check out my previous blog post on …

Announcing Camelot, a Python Library to Extract Tabular Data from PDFs

I originally wrote this post for the SocialCops engineering blog, and then published it on Hacker Noon.


The PDF (Portable Document Format) was born out of The Camelot Project to create “a universal way to communicate documents across a wide variety of machine configurations, operating systems and communication networks”. Basically …

Airflow, Meta Data Engineering, and a Data Platform for the World’s Largest Democracy

I originally wrote this post for the SocialCops engineering blog, and then published it on Hacker Noon.


In our last post on Apache Airflow, we mentioned how it has taken the data engineering ecosystem by storm. We also talked about how we’ve been using it to move data across …

How to Create a Workflow in Apache Airflow to Track Disease Outbreaks in India

I originally wrote this post for the SocialCops engineering blog, and then published it on Hacker Noon.


What is the first thing that comes to your mind upon hearing the word ‘Airflow’? Data engineering, right? For good reason, I suppose. You are likely to find Airflow mentioned in every other …
