Screenshot of Trello board where our process is actively managed by the team.

Open Data Publishing Toolkit

 Toolkit Doc Blog Post

Project Goals

After iterating on DataSF’s approach to publishing open data, the process began to stabilize. We needed a way to capture our approach for a handful of reasons:

  1. Clarify and improve consistency in the process across the team
  2. Provide a means to transparently document updates to the process
  3. Share our approach with other organizations


The toolkit was not developed in a vacuum. We had been developing and documenting the open data publishing program over time, experimenting with and improving on it. But documentation was scattered across various Google Docs. These scattered documents also missed the forest for the trees.

We had done enough lean experimentation that we set out to cohesively describe the publishing program. It was after adopting Trello as a common tool for managing data pipelines that we were able to step back and pull together all of the pieces into a common operating document. This included describing the process and the technologies that support the process.

In the toolkit document, a process diagram [pictured above] and index of processes help staff visualize how the pieces fit together.

We chose GitBook as the documentation tool as it provided transparent authoring, version control and ease of sharing.

After completion, I announced the toolkit through a blog post. While not intended for a general audience, we found that many other governments wanted to learn more about open data operations as they built out their own programs. I also fundamentally believe collective learning increases with sharing.


Having the toolkit helped with a number of things:

  1. Decreased time spent explaining our operations to peers. Now we had a document we could send along and focus on answering questions that the document didn’t cover. As a side effect, we could update our documents for clarity based on questions and feedback.
  2. Improved onboarding of new staff. The document in tandem with the Trello board provided a common starting point for new staff to pick up processes. While the document is holistic, it is also modular. This allows us to zoom into pieces of the process for training without losing the bigger picture.
  3. Improved delivery consistency. With documentation and some process automation, we were able to consistently communicate with our clients about the process and what to expect based on their intake form. Standard checklists in Trello complement the toolkit and help to make sure we don’t miss anything.

Out in the world

Leader in civic data and its many uses with 10+ years of experience in data management, analysis, visualization, and engineering. I helped build DataSF into a world-recognized program empowering use of San Francisco's data.
Jason Lally on Twitter