I haven’t been able to work much for the past week due to illness and other responsibilities, but I managed to put in a nice 6hr chunk of time between yesterday and today. As a result, I’ve hooked up live reddit data from r/news, established basic story retrieval with the Node API, modified the front-end to accept the data, and created a rudimentary setup for search with Elasticsearch. A nice uninterrupted 10pm-4am session can sometimes be so much better than several short bursts in the daytime!
I estimate this chunk completed maybe 10% of the front-end and 20% of the web server. I’ll do the alpha ETA calculation later. Here are some shots with live data:
The new story detail page with populated data. I need to clean up the keyword extraction and images.
The front page. The reddit data mining runs periodically, so the site can show live updates (server push coming soon).
Elasticsearch, which powers the search page, was remarkably easier to set up than my previous approach of building search from scratch with Lucene.
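To give a rough idea of what a search request could look like, here is a minimal sketch that builds an Elasticsearch Query DSL body for a story search. The field names (`title`, `summary`, `keywords`) and the function name are assumptions for illustration, not the actual index mapping used on the site.

```python
import json


def build_story_search(text, size=10):
    """Build an Elasticsearch query body matching `text` against story fields.

    The field names are hypothetical; "title^2" boosts matches in the title.
    """
    return {
        "size": size,
        "query": {
            "multi_match": {
                "query": text,
                "fields": ["title^2", "summary", "keywords"],
            }
        },
    }


# The resulting dict can be sent as the JSON body of a search request.
print(json.dumps(build_story_search("election results"), indent=2))
```

A body like this would be posted to the index's `_search` endpoint; which client library does the posting is left open here.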
I’ve started building the Node back-end and already I’m beginning to see why others warn against “callback hell.” I’ve picked up bluebird for Promises, which will hopefully reduce the callback difficulty. Mocha and Chai are looking good as options for testing the REST API once I have the requirements stabilized, and may offer me more opportunity for test-driven development.
I’ve currently worked 19hrs on this project, with an estimated 21% completion of the front-end. That works out to a 90hr front-end build time with 71hrs to go (compared with the 56hr estimate from the previous post). I’ve been working on other projects, so I haven’t been able to put in enough time to meet my 2-week stretch deadline.
However, I’ve set up the Express and Flask APIs and am currently using them to retrieve placeholder data for the story pages. Here’s what the story page looks like so far:
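The ETA arithmetic above is just hours worked divided by the fraction complete. A small sketch of it, assuming progress is linear in hours (the function name is made up for illustration):

```python
from typing import Tuple


def estimate_build_time(hours_worked: float, fraction_complete: float) -> Tuple[float, float]:
    """Return (total_hours, remaining_hours), assuming linear progress."""
    if not 0 < fraction_complete <= 1:
        raise ValueError("fraction_complete must be in (0, 1]")
    total = hours_worked / fraction_complete
    return total, total - hours_worked


total, remaining = estimate_build_time(19, 0.21)
print(f"{total:.0f}hrs total, {remaining:.0f}hrs to go")  # 90hrs total, 71hrs to go
```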
I’m using Newspaper to generate all of the information so far, with its built-in NLP capabilities to extract keywords and summarize the text. There was never really much of a design to begin with, so I’m thinking about how to display the statistics and sentiment analysis information when I get to it.
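For reference, the Newspaper flow for one story looks roughly like the sketch below. It assumes the `newspaper3k` package is installed (and that the NLTK tokenizer data its `nlp()` step relies on is available); the `extract_story` wrapper and the shape of the returned dict are my own illustration, not the site's actual code. The `article_cls` parameter exists only so the function can be exercised without a network connection.

```python
def extract_story(url, article_cls=None):
    """Download an article and return the fields shown on the story page."""
    if article_cls is None:
        # Third-party dependency, imported lazily: pip install newspaper3k
        from newspaper import Article
        article_cls = Article

    article = article_cls(url)
    article.download()  # fetch the HTML
    article.parse()     # extract title, text, top image
    article.nlp()       # populate keywords and summary

    return {
        "title": article.title,
        "keywords": article.keywords,
        "summary": article.summary,
        "top_image": article.top_image,
    }
```

Calling `extract_story` with a real article URL returns a dict that a Flask route could serialize as JSON for the front-end.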
The summary also won’t be a huge block of text in the final version, but a collection of helpful snippets from multiple news sources. On the right sidebar I’ve added an area for “Related,” which may imply some content recommendation in the future. I have no click data, so recommendation will most likely be based on NLP similarity (or maybe on indexing news sites’ own recommendations?).
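One simple way an NLP-similarity "Related" list could work without click data is to compare the extracted keyword sets of stories. The sketch below uses Jaccard overlap; this is just one possible approach, and the function names and story ids are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two keyword collections: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def related_stories(target_keywords, candidates, top_n=3):
    """Rank candidate stories (id -> keywords) by keyword overlap with the target."""
    scored = [(jaccard(target_keywords, kws), story_id)
              for story_id, kws in candidates.items()]
    # Drop stories with no shared keywords, then sort by score (ties by id).
    scored = [(score, story_id) for score, story_id in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [story_id for _, story_id in scored[:top_n]]


candidates = {
    "a": ["senate", "vote", "budget"],
    "b": ["storm", "coast"],
    "c": ["budget", "vote"],
}
print(related_stories(["senate", "budget", "vote"], candidates))  # ['a', 'c']
```

Keyword overlap is crude (it ignores synonyms and word weighting), so something like TF-IDF vectors with cosine similarity might be a natural next step.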