The Cary Update #1

23 Sept Meeting

  • It may be easier to use the Cary events calendar to scrape the necessary data.
  • Looking at Python with Beautiful Soup for the scraping. (I like the API, it's pretty close to JavaScript. :D )
  • Created a repo under the Code for Cary GitHub organization: https://github.com/Cary-Code-for-America/TheCary
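
To make the Beautiful Soup idea concrete, here is a minimal sketch of pulling events out of a calendar page. The HTML, the class names (events, event, title) and the parse_events helper are all made up for illustration; the real Cary calendar markup will differ, and the selectors would need to be adjusted after inspecting the page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the real Cary calendar will use different tags and classes.
SAMPLE_HTML = """
<ul class="events">
  <li class="event"><span class="title">Film A</span>
      <time datetime="2000-01-01T19:00">Jan 1, 7pm</time></li>
  <li class="event"><span class="title">Film B</span>
      <time datetime="2000-01-02T14:00">Jan 2, 2pm</time></li>
</ul>
"""


def parse_events(html):
    """Return a list of {"title", "start"} dicts found in the given HTML."""
    soup = BeautifulSoup(html, "html.parser")
    events = []
    for item in soup.select("li.event"):
        title = item.select_one(".title")
        when = item.find("time")
        # Skip entries that are missing either piece instead of crashing.
        if title is None or when is None:
            continue
        events.append({
            "title": title.get_text(strip=True),
            "start": when.get("datetime"),
        })
    return events


if __name__ == "__main__":
    for event in parse_events(SAMPLE_HTML):
        print(event["start"], event["title"])
```

For the live page, the HTML string would come from an HTTP request rather than a constant, but the parsing step would look much the same.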

Regarding storage of parsed results:

  • We talked about City Gram once, and I was curious how to link this up to a similar service. Would we have one database for all our different projects, or would each project have its own DB with a REST API? The REST API sounds like the better option.
  • Could host it on Digital Ocean once we're ready.
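
As a rough sketch of the "each project has its own DB with a REST API" option, the snippet below keeps parsed showtimes in a project-local SQLite database and answers a GET request with JSON. Everything here is an assumption for illustration: the showtimes table, the /showtimes path and the helper names are hypothetical, and handle_get is a plain function that any web framework or HTTP server could call.

```python
import json
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical showtimes table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE showtimes (id INTEGER PRIMARY KEY, title TEXT, start TEXT)"
    )
    return conn


def save_showtimes(conn, showtimes):
    """Store parsed results, given as dicts with "title" and "start" keys."""
    conn.executemany(
        "INSERT INTO showtimes (title, start) VALUES (?, ?)",
        [(s["title"], s["start"]) for s in showtimes],
    )
    conn.commit()


def handle_get(conn, path):
    """Map a GET path to (status code, JSON body)."""
    if path != "/showtimes":
        return 404, json.dumps({"error": "not found"})
    rows = conn.execute(
        "SELECT title, start FROM showtimes ORDER BY start"
    ).fetchall()
    return 200, json.dumps([{"title": t, "start": s} for t, s in rows])


if __name__ == "__main__":
    db = make_db()
    save_showtimes(db, [{"title": "Film A", "start": "2000-01-01T19:00"}])
    print(handle_get(db, "/showtimes"))
```

Other projects, or a City Gram style service, would then only need the URL and the JSON shape, not access to the database itself.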

GitHub repo changes:

  • Added my work so far… not much.
  • Creating milestones, labels, and issues… mostly to keep things straight in my head.
  • Starting to create notes in the GitHub wiki.

Questions for next meeting:

  • Would it make sense to scrape multiple data sources? Cross-checking them would help with getting the most accurate data. For example, we could scrape the main showtime page, the calendar, and the Films@TheCary page. If a showtime shows up only on the calendar and not on the others, we can safely drop it from the final scrape. This could also be implemented later on.
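
One possible way to do the cross-check, as a sketch only: collect (title, start) pairs from each source and keep the ones that at least two sources agree on. The source names, the sample data and the confirmed_showtimes helper are hypothetical, and the two-source threshold is just one reading of the rule above.

```python
from collections import defaultdict


def confirmed_showtimes(sources, min_sources=2):
    """Keep showtimes reported by at least min_sources different sources.

    sources maps a source name to a list of (title, start) tuples.
    Titles are normalized so small formatting differences still match.
    """
    seen = defaultdict(set)
    for name, showtimes in sources.items():
        for title, start in showtimes:
            seen[(title.strip().lower(), start)].add(name)
    return sorted(key for key, names in seen.items() if len(names) >= min_sources)


if __name__ == "__main__":
    # Made-up data: "Film B" appears only on the calendar, so it is dropped.
    scraped = {
        "showtimes_page": [("Film A", "2000-01-01T19:00")],
        "calendar": [("Film A", "2000-01-01T19:00"), ("Film B", "2000-01-02T14:00")],
        "films_page": [("film a ", "2000-01-01T19:00")],
    }
    print(confirmed_showtimes(scraped))
```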

Posted by Juan Orozco