
It started with this tweet.

[Image: screenshot of the tweet]
... and then I found myself on the couch during New Year's Eve with a really bad movie. Laptop out...  in the wee hours of the New Year, I released a new web service for Madison. It provides a simple JSON wrapper around Madison's recently released data for parking availability on the isthmus. I added it to the list of services available in the SMSMyBus API.

http://smsmybus.com/api/v1/getparking

I was greeted in the morning by a lot of great feedback so I hooked it up to the SMSMyBus interfaces - SMS, xmpp and email - to create an app for it. For example, you can find the percentage of open spaces in downtown lots by texting 'parking' to the service...

[Image: the parking app]
... but what I really hope is that others take this and build something creative with it.

Although it's great to get so much positive feedback from the Madison community on these little projects, it's really (really) important to recognize that the work I've done with the API is going in the wrong direction in many ways. I'd rather be building apps on top of services created by Madison. Eventually, this great city of ours needs to stop building websites and start producing core data feeds like the ones I've built. That's the foundation. That is the government as a platform.

Filed under  //   gov2.0   madison   opengov   projects  

Comments [0]

Nearly two months ago, I wrote that I had canceled the data plan on my mobile phone. I'm reporting back that it has been a smashing success. I love being connected... to the people around me rather than the interwebs.

If you're disciplined about keeping the phone in your pocket, good for you. If you're not, this might be an option for you.

The experiment got an extra boost two weeks in, when I promptly lost my non-smart phone. Brilliant move in retrospect. I avoided the nervous, fidgety grabbing of the phone to see if a new message had arrived. One might say that I went cold turkey, and it likely made the whole transition easier. So easy, in fact, that I often questioned whether I needed any kind of phone at all!

That's not to say that the data plan isn't missed. My expectations were pretty spot on. I miss Google Maps and posting photos for the family. Here are the few notes I've jotted down along the way...

The pot

Let's get it right out there... I miss having reading material when I'm on the pot. If you have a smart phone, you know what I'm talking about.

My calendar

I miss access to my calendar... I haven't missed big meetings, but I have missed those non-vital but still important events that aren't necessarily on my radar every day. I've looked into the Google API to get access via SMS and will try to implement this one.

Twitter

I'd like to find a better desktop tool for Twitter. Now that I see less of it throughout the day, it would be nice to find a tool that helps me catch up on some feeds I don't want to miss.

Email

Email becomes less important - which is good. Everyone talks about the email tax where you can't control inbound email. I've found that by sending less and reading less frequently, I've been able to lower the burden of managing email. It now comes in well controlled bursts - those blocks of time that I dedicate to my inbox. I've also become ruthless with unsubscribe options.

Better focus

I haven't quantified this, but it feels like I have better focus through less distraction. If you're not pulling your phone out all the time to check in on your online life, you've improved your chances of focusing on a specific task whether it's work, a game with the kids or cleaning up around the house.

You all have become more annoying

I'm now way more annoyed by people who choose their phones over me. Whether it's to take a phone call, respond to a text message or check the sports scores, it's nothing short of annoying. I don't know how much I did this to other people two months ago, but I'm glad I don't do it anymore.

 

 

Filed under  //   family   observations   quantified-self  

Comments [11]

This post revisits my earlier evaluation of Google App Engine's post-preview pricing changes and how they affected my project, SMSMyBus. As I noted, the app was projected to cost between $6 and $7 per day under the new platform pricing.
 
Since that post, I’ve been rolling out small, incremental changes to optimize the code and combat all of the known issues. I’m thrilled to report that I have the price down to $0/day. And I’m once again impressed by the snappy and reliable App Engine platform.

Looking back on the changes, I can say that I was doing some bad things, some abusive things, and App Engine was making some bad choices as well. But the end result proves that if its developers optimize and do smart things, they are rewarded. App Engine remains a great solution for my transit API service.

Here’s a history of the changes over the last two months that got me down to $0/day...

Platform Configuration

1. Instance allocation
The new pricing model charges applications based on their use of instances (hardware resources where your application is running) rather than CPU utilization. A key to keeping your instance cost down is to simply reduce the number of instances that are spinning. Duh. So I grabbed the instance slider in the application settings and yanked it to the left. This doesn't prevent scaling, it just limits my billing for normal traffic flow.

2. Delete data
App Engine data storage (for your database) costs $0.008/GByte-day. Doesn’t sound too expensive, but I had been storing every single API call I had ever gotten. I thought it would be useful for API developers and for analytics. My drive to $0 outweighed that, however, so I deleted all of the history data and got under the free quota for storage.

Application Configuration

3. Memcached the application's route listings
I was surprised to find that I wasn’t doing this already, but there it was. I have a data structure that maps bus routes and bus stops to scheduling data on the Metro website and it never changes. In some cases - like the static calls from the kiosk clients - I was looking up route listing details in the datastore once every minute!! Fail. I used memcache to keep the common queries in memory and avoid the extra datastore reads.
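A minimal sketch of that cache-aside pattern, for illustration only. FakeCache is an in-process stand-in for a memcache client, and query_route_listing, the key format and the sample routes are hypothetical names and data, not the SMSMyBus code.

```python
class FakeCache(object):
    """In-process stand-in for a memcache client (get/set only)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


# Counts how often the (pretend) datastore is actually queried.
datastore_reads = {"count": 0}


def query_route_listing(stop_id):
    """Hypothetical slow path: pretend to read a route listing from the datastore."""
    datastore_reads["count"] += 1
    return {"stop_id": stop_id, "routes": ["2", "6"]}  # made-up sample data


def get_route_listing(cache, stop_id):
    """Return the listing from the cache, touching the datastore only on a miss."""
    key = "route_listing:%s" % stop_id
    listing = cache.get(key)
    if listing is None:
        listing = query_route_listing(stop_id)
        cache.set(key, listing)
    return listing


# A kiosk-style client asking for the same stop once a minute for an hour.
cache = FakeCache()
for _ in range(60):
    get_route_listing(cache, "1101")
```

After the loop, datastore_reads["count"] is 1: sixty requests, one datastore read.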
 
4. Limit access during off hours
One thing that never changes is when the Metro service is running. There are five+ hours a day where the buses aren’t on the street. But some clients are still asking for data. I stubbed out most of the API during these off hours before the code ever gets close to making a datastore or memcache call.
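The off-hours stub can be sketched roughly like this. The service window and the shape of the stubbed response are assumptions made up for the example; the real Metro hours and API response are not given here.

```python
from datetime import time

# Assumed service gap, for illustration only.
SERVICE_ENDS = time(1, 0)    # last buses off the street
SERVICE_STARTS = time(6, 0)  # first buses back out


def is_off_hours(now):
    """True when the local time of day falls inside the nightly service gap."""
    return SERVICE_ENDS <= now < SERVICE_STARTS


def handle_request(now, lookup):
    """Answer with a canned response during off hours; otherwise do the real lookup."""
    if is_off_hours(now):
        # Hypothetical response body. Returning here means no datastore
        # or memcache call is ever made for this request.
        return {"status": "-1", "description": "No service at this time"}
    return lookup()
```

The point of the design is that the time check is the very first thing the handler does, so off-hours traffic costs almost nothing.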

These four changes brought me down to $0.70 per day. Bam!

Algorithm Changes

5. Asynchronous screen grabs
If you don’t know, behind the API curtain is an ugly screen scraping task that extracts the arrival estimates from the Metro website. So when a client requests arrival data for a stop, the app goes off and requests multiple web pages, machine-reads the information and aggregates all of the results.

The original implementation of the SMS interface did this by creating multiple tasks (one for each route traveling through the respective stop). When a task ran, it stored the results in the datastore. An aggregator task would read those results out of the datastore and piece together the response to the caller.

When the API was created, I couldn’t use background tasks because I had to respond with results in the same HTTP context. That’s when I discovered the great feature, asynchronous url fetch. This essentially let me grab all of the different Metro web pages at the same time. But when I implemented this, I continued to use the datastore as the mechanism for storing and retrieving results. This was just lazy. Under the old pricing, I wasn’t incented to change it other than the fact that it was a bit slow.

Under the new pricing model, this solution was very expensive. The API is continuously running this aggregation algorithm - constantly writing and reading to the datastore for model instances that have a lifespan of under a minute!

I rolled out a change that removed the use of the datastore and instead sorted the aggregated results in memory. This had a dramatic effect on my API quota for datastore reads and writes as well as overall performance and latency for my users. The savings were biggest for write operations, where you get penalized by an order of magnitude for this type of behavior because index updates count against your API quota as well.
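Here is a rough sketch of the fetch-everything-at-once, aggregate-in-memory idea. It swaps in a standard-library thread pool for App Engine's asynchronous url fetch, and fetch_page, parse_page and the sample data are hypothetical placeholders for the real scraping code.

```python
from concurrent.futures import ThreadPoolExecutor


def aggregate_arrivals(route_urls, fetch_page, parse_page):
    """Fetch every route page concurrently, then merge and sort in memory.

    Nothing is written to a datastore; the results only live for the
    duration of the request.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(route_urls))) as pool:
        pages = list(pool.map(fetch_page, route_urls))

    arrivals = []
    for page in pages:
        arrivals.extend(parse_page(page))
    return sorted(arrivals, key=lambda a: a["minutes"])


# Fake pages keyed by URL so the sketch runs without touching the network.
fake_pages = {
    "route-4": [{"route": "4", "minutes": 12}],
    "route-2": [{"route": "2", "minutes": 3}, {"route": "2", "minutes": 18}],
}
result = aggregate_arrivals(
    ["route-4", "route-2"],
    fetch_page=fake_pages.get,    # stand-in for an HTTP fetch
    parse_page=lambda page: page  # stand-in for the screen scraper
)
```

With the fake data, result lists the arrivals soonest first: route 2 in 3 minutes, route 4 in 12, route 2 in 18.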

6. Dogfood
After optimizing the API, I realized that the original SMSMyBus apps (SMS, chat, email and phone interfaces for the Metro) were now the long pole. Those apps were implemented before the API existed so they weren’t benefiting from the API optimizations. Solution... re-implement to use the SMSMyBus API.

It should have been done long ago simply as a validation exercise of the API methods. Credit to the elegance and simplicity of the API - this port was simple and only took a couple of hours.

These two changes brought me down to $0.10/day. Badda-bing.

AppStats

7. Run Appstats on all application interfaces
The last stop on the optimization train was Appstats. A truly great tool in the App Engine toolbox. In just a matter of minutes, you can find the hidden datastore operations that are dragging you down. In my case, it led me to one area that wasn’t being memcached at all. And it revealed an area that was simply using the memcache incorrectly! Love this tool...
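For reference, Appstats on the Python runtime is usually switched on with a small WSGI middleware hook in appengine_config.py. This is the commonly documented form; whether SMSMyBus enabled it exactly this way is an assumption.

```python
# appengine_config.py
def webapp_add_wsgi_middleware(app):
    """Wrap the application so Appstats can record each request's RPCs."""
    # Imported inside the hook so this file only needs the App Engine SDK
    # when the hook actually runs on the platform.
    from google.appengine.ext.appstats import recording
    return recording.appstats_wsgi_middleware(app)
```

Once the recorded requests show up in the Appstats console, repeated datastore calls on a single request are the ones worth chasing first.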

This change brought me down to $0.00/day. Winning.

Results
App Engine remains a great platform for developers that don’t abuse it and take the time to optimize their applications.

The SMSMyBus API now serves over 6,000 transit requests per day. It’s fast, reliable and flat out fun to use. I’m as proud as ever that I brought this to Madison.

Next step... find a way to fund my SMS users. :)

 

 

Filed under  //   appengine   gov2.0   opengov   programming   projects  

Comments [19]

... and here are some of the good things that I expect to happen:

  1. I'll be more engaged when I'm with my kids
  2. I'll have more money in pocket each month
  3. I'll spend more time writing
  4. I'll spend more time reading books
  5. I'll text more with my family
  6. I'll think more about the things needing thinking about
  7. I'll talk to more strangers when I'm waiting in lines
  8. I'll build more Twilio apps that give me access to apps via SMS
  9. I'll be unplugged from work

And here are the bad things that I expect will happen:

  1. When I'm traveling, I'll get lost more without the benefit of Google Maps in my pocket
  2. I'll post fewer mobile photos that I think others would enjoy
  3. I'll miss out on serendipitous information flow I get on Twitter 
  4. My Twilio bill will go up with more SMS apps getting built
  5. I'll be unplugged from work (yes, that is double-edged)

It's only been two hours but it feels good so far. Text me in two weeks and ask me how I feel then. :)

 

 

Comments [8]

[Photo from MadCamp]

We had another terrific Barcamp event (dubbed "MadCamp") this year. Supported by great sponsors, a great community of learners willing to share their time, and the tireless effort of Phillip Crawford to pull it all together, it was a proud day for Madison.

Urban Land Interest provided a great venue right on the Capitol Square. But it wasn't without its challenges. MadCamp sessions were spread out across three floors - with six simultaneous sessions. Attendees had the challenge of getting back to the session board on the first floor, reviewing the options, finding the next room and still using the fifteen-minute break to network with other attendees. To limit the burden of foot travel, we did two things.

First, we put the session board online using Google Docs and projected it on a giant wall on the first floor. This not only enabled people to view it from anywhere on their own devices, but it allowed multiple people to edit it at the same time.

Second, now that the session board was online we were able to set up a notification system that could read from Google Docs via the Google Data API. The MadCamp Notifier was born. Every hour, the notifier would use the Twilio SMS API to send every (subscribed) attendee a list of the next set of sessions, including each session's title and room number.
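The message-building half of a notifier like this might look something like the sketch below. The row format, session titles and room numbers are invented for the example, and reading the spreadsheet and sending through Twilio are left out entirely.

```python
def next_sessions_message(sessions, hour):
    """Build one SMS body listing the sessions that start at `hour`.

    `sessions` is a list of dicts with hypothetical keys: hour, title, room.
    Returns None when nothing is scheduled, so no text gets sent.
    """
    upcoming = [s for s in sessions if s["hour"] == hour]
    if not upcoming:
        return None
    lines = ["Next up at %s:" % hour]
    for s in upcoming:
        lines.append("%s (room %s)" % (s["title"], s["room"]))
    return "\n".join(lines)


# Made-up rows standing in for the online session board.
board = [
    {"hour": "2pm", "title": "Intro to Twilio", "room": "301"},
    {"hour": "2pm", "title": "Open Data", "room": "102"},
    {"hour": "3pm", "title": "App Engine Tips", "room": "301"},
]
message = next_sessions_message(board, "2pm")
```

The same body would go to every subscribed number, so it only needs to be built once per hour.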

Steven Faulkner had the great idea of including a feedback interface so attendees could text in feedback for the individual presenters at the end of a session. A difficult habit to create in a condensed amount of time, but some folks did provide some thoughtful feedback throughout the day.

So I think there's a product here somewhere. A better way to organize and connect with others at an event. A free idea! Who's going to take it? If you want a head start, here's the MadCamp code... http://github.com/gtracy/madcamp-notifier

Incidentally, both Twilio and Google were sponsors of MadCamp so in many ways, this app was our thank you note.

Photo Credit: Pete Prodoehl

Comments [4]

Over a year ago, I smiled when I found SMSMyBus getting repurposed in the form of a kiosk at the Mother Fool's Coffeehouse.  I wrote then about the personal satisfaction of seeing a pet project growing into something bigger.

Well, last week, that satisfaction blossomed into gratification when I learned that the University of Wisconsin had included the SMSMyBus API in the latest version of their app released for the 2011 academic year. Now students using the app can get real-time bus arrival estimates throughout the city right from their iPhone and Android devices, rather than relying on the fixed schedule.

Super proud.

Mostly because this is exactly the kind of application that can get the attention of the Metro and the City of Madison as an example of what can be accomplished if more public data becomes accessible through easy-to-consume, programmable APIs.

Dear Mayor, please take note. We can achieve more, do so at a lower cost, and generate more economic development if we work together.

Filed under  //   gov2.0   madison   opengov   projects  

Comments [2]

The Google App Engine developer community is a hot mess this week over the new pricing plan for the platform. And for good reason. Many developers are seeing their hosting expenses going up by as much as 500%.

If you're looking for a post that is trashing the App Engine team, you can move along. You won't find it here. These guys are smart and considerate. If you spend any time interacting with them on StackOverflow, email or in person at Google IO, you understand this. In fact, just by using the platform for a project, you can appreciate their outlook and passion for their product and users. That's not to say they don't have room to improve. But enough with the negativity already!

Effects of new pricing on my projects

There have been a lot of people posting about their apps and revealing the effects of the new pricing on them. I wanted to do the same as a reference point. Note that my use of App Engine has primarily been for personal projects. Some have web front-ends, some have SMS interfaces, some are just based on background tasks and others come and go while I experiment with ideas or calendar events. I still think most of these experiments are well suited for App Engine, but I need to take a hard look at the more successful apps to figure out a long-term strategy because they are not scaling well with the new pricing plan.

I'll share two examples - both philanthropic projects - comparing the effects of the new pricing.

Astronomy Picture of the Day

I originally wrote this app in Perl as a grad student, and it was hosted at the University of Wisconsin. I decided to port the application as the vehicle for learning App Engine and Python, so it was the first app I ever wrote on the platform. It's primarily a background app. Every afternoon it runs a job that scrapes the contents of the APOD site, packages it into an email and sends it off to all subscribers. There's a simple web frontend that lets anyone sign up. There are currently 1900+ subscribers.

The app is free to run on the platform today and will cost $0.19/day - or $0.03 per user per year - after the price changes. 100% of those costs can be attributed to the use of the Mail API.

My only complaint about this app is that the change seems extreme. Going from 2,000 free emails to 100 feels like an attempt to curb the spamming community. And for the charity projects like this, all of the good net citizens are the losers with this change.

SMSMyBus

This app was originally built to provide a better interface for the Madison Metro bus service. It provides real-time arrival times for buses via SMS, chat, email and phone. But then it blossomed into a full-featured API for the Metro for other developers.

The app costs $0.01/day to operate today (excluding the SMS interface). It is estimated to cost $6.79/day after the price change. $2,478/year. Yah. That's a whopping 67,800% increase. Shebang.
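The arithmetic behind those numbers, as a quick check using only the two daily figures quoted above:

```python
current_per_day = 0.01    # today's cost, excluding the SMS interface
projected_per_day = 6.79  # estimate after the price change

# Yearly cost at the projected daily rate.
yearly = projected_per_day * 365

# Percentage increase relative to today's cost.
increase_pct = (projected_per_day - current_per_day) / current_per_day * 100

print("yearly: $%d, increase: %d%%" % (round(yearly), round(increase_pct)))
```

Rounded to whole numbers, that works out to $2,478 a year and a 67,800% increase.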

Essentially all of the cost can be attributed to the main API call, getarrivals, which returns arrival times at a particular bus stop. Some of the clients call it repeatedly (like every two minutes). It is also where the confusion starts for me with respect to the new pricing.

Frontend instance hours

Frontend instance hours are projected to cost $5.68/day, 84% of the bill. This represents the platform's transition from billing for CPU usage to billing for instance usage. I get it that they need to do this. They were using the wrong resource metric for monitoring before.

But how do I go from a $0.00 cost for resource consumption to $5.68/day?!? That kind of increment just feels insane. How about $0 to $0.50? Or $0 to $1?

Datastore writes

Datastore writes are projected to cost $1.00/day, 14% of the bill. This is harder for me to resolve for a couple of reasons. First, I can't find any cost under the current pricing plan for these operations even though the app's profile is fairly consistent. So I struggle, conceptually, with how this has suddenly become an issue for the app.

Second, $1/day equates to 1M writes/day in the datastore and I simply can't figure out where all of those writes are coming from. My back of the napkin math shows 40,000 writes. I'm totally baffled by this projection. 
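To put a number on that gap, here is the same back-of-the-napkin math spelled out. The per-write price is simply implied by the "$1/day equates to 1M writes/day" figure, not taken from a price sheet.

```python
projected_cost_per_day = 1.00
projected_writes_per_day = 1000000

# Price per write implied by the projection above.
implied_price_per_write = projected_cost_per_day / projected_writes_per_day

# What my own estimate of the write volume would cost at that price.
estimated_writes_per_day = 40000
estimated_cost_per_day = estimated_writes_per_day * implied_price_per_write

# How far apart the two write counts are.
gap = projected_writes_per_day / estimated_writes_per_day
```

At the implied price, 40,000 writes would cost about $0.04/day, and the projection is 25 times larger than the napkin estimate.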

The rest of it

The rest of the projected cost is a combination of storage and datastore read operations. I can eliminate the former if I simply store less data I wanted to use for analytics. It saves me money, but in the end, ignoring some of the data hurts the developers that use the API.

Optimizing

Now it's my job to go in and take another stab at optimizing the code and start with the getarrivals API call. I thought I had good habits with this so I was a little embarrassed when I found an obvious hole in the query path for route listings. There's a fairly repetitive query that was not being memcached - oops! Now fixed.

The second thing I'm experimenting with is the application's instance configuration. By default, I was letting the platform's scheduler determine my load patterns and create new instances whenever necessary. But I've made two changes. First, I took the scheduler out of 'auto' mode and set the maximum number of idle instances to one. Second, I cranked up the minimum latency for the pending request queue to 250ms. In theory, each of these changes should drive the cost down because I should be using less frontend instance time throughout the day.

Let's see what happens! As I do my part with optimization, I'd like to see the App Engine team do their part and move to the middle as well. :)

What to do next

I'm guessing that the App Engine platform simply priced things wrong the first time. I think the concept of platform as a service that exploits existing Google infrastructure was a smart, but geeky idea that was poorly modeled or had bad assumptions about its use/abuse. Ironically, the idea didn't scale well and they've been forced to admit that early assumptions on how to price it were just wrong.

The good part about this move... developers are forced to take a deep dive into optimization. Something I've written about before and have been doing again since the clock started on the pricing changes. This not only makes for a better platform for Google and the projects sharing the resources, but it makes for a better net as a whole. Faster is better.

The bad part... 

  • Developers will be forced to dead pool worthy projects that don't have a business model. 
  • Developers will be forced to port apps to other platforms. That could be a painful pill to swallow for developers when they aren't money-making projects.
  • Developers may be sacrificing analytics to avoid datastore bloat and access charges.

What the App Engine team should do about it

  • Provide better pricing structures for philanthropic and open source projects. App Engine is a great platform for these things and it provides a great playground for developers to support important projects at a low cost while also learning about a platform they can adopt for larger, commercial projects down the road. They've hinted at this but will they do it? - http://code.google.com/appengine/kb/postpreviewpricing.html#special_programs_...
  • Provide more runway for optimization. A couple of weeks for developers to roll up their sleeves and optimize their apps just isn't enough time.
  • Provide better analysis tools to highlight problems.
  • Take baby steps. Must they really take these giant leaps in pricing?
  • Roll out Python 2.7 to support concurrent requests in Python projects.

In the process of writing this post I found some great resources...

 

 

 

Filed under  //   appengine   google   programming   projects  

Comments [9]

A post on my Madison Transit API homebrew is long overdue. It was an incredibly fun project and when you see how it has enabled others, you realize how much more compelling it is than the simple SMS app I had originally created.

The API has been out for a long time, but there was very little usage early on. I'm not sure what sparked the flame but in the last few months or so there has been a lot of interest and a lot of activity from developers. It dawned on me that I've yet to write about it. And that's a shame because the API is actually the most clever piece of the whole SMSMyBus effort.

In the beginning

In the beginning, there was a very specific goal of creating an SMS app that delivered real-time arrival estimates. That was it. It was in response to my own need - as a commuter - for such an app. It was also going to be an entry into the Twilio developer contest that was run the week they released the SMS API publicly. 

I knew going in that the Metro didn't provide a formal API or even an XML feed, but I'd done enough HTML screen scraping before to know how to overcome it. Little did I know how tedious that would become with the Metro data. So what started out as a simple texting application turned into a day of cursing the poorly formatted HTML I found on my screen. And every minute of the way I could be heard asking the same question, "Why doesn't the Metro have an API?!?"

When I finished, I swore that I would never let anyone else suffer the pain that I endured, so I decided to abstract the work I did for the texting app and provide access to my data so other developers could get to the transit data through a standard web service interface.

Why it's clever

The ugly work of screen scraping was now done and buried in the implementation of the SMS application. But by creating a couple of other web handlers, I could easily make the same scheduling data accessible via a web service call.

For example - http://www.smsmybus.com/api/v1/getarrivals?key=nomar&stopID=1101 - will return real-time arrival estimates for routes traveling through the stop at Main and Carroll on the square. That's the way every developer wants to query data from a web service.
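From a client's point of view, building that request takes only a few lines. This sketch uses just the two parameters visible in the example URL (key and stopID); the helper names are made up, and the format of the response is not shown here, so the fetch helper hands back the raw body untouched.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

API_BASE = "http://www.smsmybus.com/api/v1"


def getarrivals_url(key, stop_id):
    """Build the getarrivals request URL for a developer key and stop ID."""
    query = urlencode([("key", key), ("stopID", stop_id)])
    return "%s/getarrivals?%s" % (API_BASE, query)


def fetch_arrivals(key, stop_id):
    """Call the service and return the raw response body (not run here)."""
    with urlopen(getarrivals_url(key, stop_id)) as response:
        return response.read()


url = getarrivals_url("nomar", "1101")
```

Here url comes out as the same address quoted above.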

With a little more scraping, I could also find the physical location of every stop (latitude and longitude) so developers could query the location of stops by a stop ID and could also query all nearby stops based on a lat/long coordinate. In the end, there was a pretty impressive looking transit API - simple to use and understand for new developers. And it didn't matter if it was ugly in its technique of screen scraping inside the implementation. It had a clean interface for the developers.

Supported API methods include...

getarrivals - Get real-time arrival estimates for any stop ID with the option of filtering by route ID.

getroutes - Get a list of every route in the Metro system.

getstops - Get the details for every stop a route travels through.

getnearbystops - Get the details for all stops within a given distance of a latitude/longitude.

getstoplocation - Get the location details for a stop given a stop ID.

getvehicles (not implemented) - Get details for a particular vehicle on the road.

getservicebulletins (not implemented) - Get a list of service bulletins in the Metro system.
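A method like getnearbystops boils down to a distance filter over stop coordinates. The sketch below uses the haversine formula as one reasonable way to do it; it is not necessarily how the API computes distance, and the stop IDs and coordinates are made up.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two lat/long points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def nearby_stops(stops, lat, lon, radius_miles):
    """Return the IDs of stops within radius_miles, nearest first."""
    hits = []
    for stop in stops:
        d = haversine_miles(lat, lon, stop["lat"], stop["lon"])
        if d <= radius_miles:
            hits.append((d, stop["stopID"]))
    return [stop_id for _, stop_id in sorted(hits)]


# Hypothetical stops: two close to the query point, one several miles north.
stops = [
    {"stopID": "A", "lat": 43.0747, "lon": -89.3841},
    {"stopID": "B", "lat": 43.0757, "lon": -89.3841},
    {"stopID": "C", "lat": 43.2000, "lon": -89.3841},
]
close_by = nearby_stops(stops, 43.0747, -89.3841, radius_miles=0.5)
```

With these sample points, close_by contains A and B, in that order, and leaves out C.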

Why I Built It

As I noted, a big reason for building the API was simply to save the next person the trouble of building up a new set of screen scraping tricks to get the data. But I also knew that most people that have attempted this have stopped at the schedule data. I don't know anything that has gone the extra step of getting the route and stop location data as well. 

More importantly, I wanted to find a way to contribute to the larger mission of opening up a public dataset to make it more accessible. Open data systems are important for lots of reasons, but most importantly, they allow communities to operate more efficiently. And nowhere is that more evident than in transit. Every city, state, and national government in the world should be on a path towards open data right now.

Unfortunately, Madison has been slow to find its course. My dream was that by creating the API and recruiting some more developers to use it, we'd have enough applications to take back to the city to say, "Look! If you open more data, developers will do great things!"

I also did it for the intellectual challenge of building an API. I've been an API consumer for a long time, but had never constructed one myself. It's an exercise I encourage any API consumer to go through. You'll have a whole new perspective on what it takes to build, document and support an API.

Sample Applications

In theory, all of the original interfaces - SMS, google chat, email, and phone - could have been re-implemented using this API, but I didn't go back and do that. But lots of new apps have been built. For example, the bus kiosk displays hanging in a few local businesses use the API to visualize arrival estimates for nearby stops.

Larry Walker used the API to build a Chumby app and also used it to display arrival estimates on a small LED display on an Arduino. He also built a mobile browser app to more easily access arrival estimates. And I used the API to build a Google gadget so you can get scheduling data in the sidebar of gmail. The attached gallery shows off some of these examples.

Go get started building your own Madison transit app! http://www.smsmybus.com/api/


 

 

Filed under  //   appengine   madison   programming   projects   twilio  

Comments [6]

Twilio sucked me into one of their contests this weekend. The rules were simple - build an application that leveraged their new Twilio Client, which enables anyone to build phone and SMS interfaces into the browser.

It's a beautiful extension of their existing APIs and really does get you to ask the question, "Why do I need a separate phone device to communicate with others?" 

I didn't have a killer idea or business idea to build on top of this so I decided to do something simple but make it available to anyone to use and extend. I built a browser-based phone that anyone can download and deploy to Google App Engine themselves. All you need is a Twilio account.

The phone has four features. Once you've logged into the application, you can do the following...

  • Call anyone using a dial pad right in the browser
  • Receive calls in the browser when people call your Twilio number
  • Receive calls from friends that log in to the app in their browsers
  • Record voicemail if you aren't logged in to answer (and send you an SMS to let you know)
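The incoming-call behavior in that list (ring the browser if you're logged in, otherwise take a message) maps naturally onto TwiML, the XML that Twilio reads to decide what to do with a call. Here is a rough sketch of generating it; the function name, client name, prompt wording and callback path are all hypothetical, and this is not the code from the project.

```python
import xml.etree.ElementTree as ET


def incoming_call_twiml(client_name, logged_in, voicemail_action):
    """Return TwiML that rings a browser client, or records voicemail instead."""
    response = ET.Element("Response")
    if logged_in:
        # Connect the caller to the named Twilio Client running in the browser.
        dial = ET.SubElement(response, "Dial")
        ET.SubElement(dial, "Client").text = client_name
    else:
        # Nobody is logged in: play a prompt, then record a message.
        ET.SubElement(response, "Say").text = "Please leave a message after the tone."
        ET.SubElement(response, "Record", {"action": voicemail_action, "maxLength": "60"})
    return ET.tostring(response, encoding="unicode")


ringing = incoming_call_twiml("greg", True, "/voicemail")
voicemail = incoming_call_twiml("greg", False, "/voicemail")
```

The handler behind the hypothetical /voicemail path is where the SMS notification about the new message would be sent.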

Go give me a call - http://phone.gregtracy.com

It was super fun to build and even more fun to use. If you'd like to get your own you can find the project on github - http://github.com/gtracy/phonefree-twilio. It's intended to be a turn-key solution for App Engine. Just deploy and go.

There are lots of opportunities to extend this if you're interested. I'd like to implement voicemail and SMS. I'd like to put a design on this so it looks nice. :) I'd also like to try implementing conferencing so you could have party lines right inside the browser. What else?


Filed under  //   appengine   projects   twilio  

Comments [2]

If you haven't seen it yet, find 25 minutes in your day and watch Conan O'Brien's Dartmouth Commencement speech from last week.

It's as funny as you might expect. But it's also raw - you just need to stay with it until the end. Conan lays out his passion for comedy and how it took failure for him to reinvent himself. It's brilliant. And it's very likely advice that means nothing to these twenty-two-year-olds. It's a speech that is written for us 30- and 40-year-olds. The folks that have had enough jobs to know what we don't want, but may still be searching for our true passions and goals.

Comments [0]