Two big features I’m interested in that I’ll cover today:
Preface:
Assuming you already have a working k8s cluster and context, you need to install the TS Operator. Installing via Helm is also an option, but I chose static files.
Get the latest development version:
wget https://raw.githubusercontent.com/tailscale/tailscale/main/cmd/k8s-operator/deploy/manifests/operator.yaml
Create a new TS Oauth app, for the Operator: https://login.tailscale.com/admin/settings/oauth
Then configure the OAuth secret in the file:
client_id: XXXXXXXXXX
client_secret: tskey-XXXXX
I’m also choosing to use the TS proxy in auth mode, which “impersonates requests from tailnet to the Kubernetes API server”. This requires some additional configuration, but seems a bit more flexible versus the “noauth” mode, which only proxies requests and requires a bit more low-level work to get your TLS certs configured correctly.
Configure the Operator section in the file:

    - name: APISERVER_PROXY
      value: "true"
Apply the file to your cluster:
$ k apply -f operator.yaml
namespace/tailscale created
secret/operator-oauth created
serviceaccount/operator created
serviceaccount/proxies created
customresourcedefinition.apiextensions.k8s.io/connectors.tailscale.com created
customresourcedefinition.apiextensions.k8s.io/proxyclasses.tailscale.com created
clusterrole.rbac.authorization.k8s.io/tailscale-operator created
clusterrolebinding.rbac.authorization.k8s.io/tailscale-operator created
role.rbac.authorization.k8s.io/operator created
role.rbac.authorization.k8s.io/proxies created
rolebinding.rbac.authorization.k8s.io/operator created
rolebinding.rbac.authorization.k8s.io/proxies created
deployment.apps/operator created
ingressclass.networking.k8s.io/tailscale created
Then, for the auth proxy, apply the RBAC roles in the most insecure fashion:
curl -s https://raw.githubusercontent.com/tailscale/tailscale/main/cmd/k8s-operator/deploy/manifests/authproxy-rbac.yaml | k apply -f -
clusterrole.rbac.authorization.k8s.io/tailscale-auth-proxy unchanged
clusterrolebinding.rbac.authorization.k8s.io/tailscale-auth-proxy unchanged
I’m using my own private tailscale network, so auth-ing myself with all privileges is easiest:
kubectl create clusterrolebinding askedrelic --user="askedrelic@github" --clusterrole=cluster-admin
Lastly, you can use Tailscale to create your K8S context and use it!
tailscale configure kubeconfig tailscale-operator
You should have K8S access using the TS context now!
There are other options for getting secure access to your apiserver via ssh forwarding or tailscale subnets on the server itself, but consolidating more configuration like this into K8S makes sense for me.
Let’s run an app (only internally available to K8S) and connect to it over Tailscale. I’ll use the default nginx image:
k create deployment nginx --image=nginx
TS exposes several options here. The first is exposing the app via a Service:
kind: Service
apiVersion: v1
metadata:
  name: nginx-ts
spec:
  type: LoadBalancer
  loadBalancerClass: tailscale
  selector:
    app: nginx
  ports:
    - name: http
      port: 80
      protocol: TCP
      targetPort: 80
Assuming that creates successfully, you can get the Service details:
$ k get service nginx-ts
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
nginx-ts LoadBalancer 10.96.207.167 100.121.9.15,default-nginx-ts-2.bee-hake.ts.net 80:30697/TCP 9s
And it should be available via the TS IP:
$ curl 100.121.9.15
...
<h1>Welcome to nginx!</h1>
...
or via the generated TS device name, assuming you have TS DNS enabled locally:
$ curl default-nginx-ts-2.bee-hake.ts.net
The other option is via Ingress, which can make the app available via HTTPS with a Let’s Encrypt generated cert, assuming you have HTTPS enabled in your TS settings.
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: nginx-ts
spec:
  defaultBackend:
    service:
      name: nginx
      port:
        number: 80
  ingressClassName: tailscale
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-ts
spec:
  ports:
    - name: http
      port: 80
      targetPort: 80
  selector:
    app: nginx
  type: ClusterIP
While testing this, Service/Ingress changes didn’t seem to be reconciled by the Operator, so deleting and re-creating seems required for now.
And now it should be available via HTTPS:
$ curl https://default-nginx-ingress.bee-hake.ts.net
...
<h1>Welcome to nginx!</h1>
...
Overall, the Operator has worked smoothly for me in testing and it has made things more convenient and secure for my network.
In Part 2, I’ll explore connecting apps together over the TS network.
Here are several other printable calendars I found recently that were inspirational:
But I couldn’t find a “quarter” or 12 week view to cover my Recurse timeframe!
My first thought was to stick with a static HTML+CSS page, but I decided to have the dates be dynamic, which requires JavaScript. And once you start doing any kind of complex JavaScript, it’s convenient to use React, and inline CSS styling with your React components, and then it turns into a fully compiled project. Therefore, I chose Hugo.
Hugo was easy to learn and get started with. It has hot-reload support, which is important for HTML design work. I initially tried to use static <script> tags for importing React and Tailwind, but moved to a fully imported and compiled pipeline inside Hugo, to get imports and tree shaking working. Hugo uses esbuild, which seems like a really nice, fast asset compiler.
Two links that helped me understand the Hugo asset pipeline:
At this point, I had compiled HTML and Javascript, but CSS was still being included via a static script tag in the HTML header:
<script src="https://cdn.tailwindcss.com"></script>
I wanted to inline importing Tailwind CSS, so that I could use it inside my React code. Tailwind recommends their own separate watch and compile step, like so:
npx tailwindcss -i ./src/input.css -o ./src/output.css --watch
But I mostly ignored this. I ran into caching issues in dev with hot-reloading, where 50% of the time a Hugo rebuild would not pick up my Tailwind CSS changes in a JS file. I was mostly done with the project, so I didn’t investigate further, since a full hugo build always worked. I did find several tickets with suggested fixes.
The last visual debugging I did was around how the actual printed product looked! I printed several copies, as I was working, and tweaked some of the font sizing and colors for clarity. It leaves a nice paper trail of prototypes:
Here is the finished product: https://quarterly-cal.asktherelic.com
Overall, bootstrapping and developing the app were pretty straightforward; I’ve done enough HTML+CSS before. Tailwind is a good standard framework for CSS that makes sense for me.
The hardest part was the JavaScript date math for calculating and rendering the week numbers and spacing. I was trying not to import too many libraries, but not using a JS date library made this more difficult.
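The core of that date math can be sketched in a few lines. This is an illustrative sketch in Python for brevity (the real project does it in JavaScript, and `quarter_weeks` is a made-up name, not from the project):

```python
import datetime

def quarter_weeks(start, weeks=12):
    # snap to the Monday of the starting week, then list each
    # week's Monday for a 12-week "quarter" view
    monday = start - datetime.timedelta(days=start.weekday())
    return [monday + datetime.timedelta(weeks=i) for i in range(weeks)]

# a quarter containing Wed Jan 3, 2024 starts on Mon Jan 1
weeks = quarter_weeks(datetime.date(2024, 1, 3))
```

The tricky parts in JS were exactly the things `timedelta` hides here: week boundaries, month lengths, and leap years.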
Source: https://github.com/askedrelic/quarterly-calendar/
I’m fortunate to be in a stable financial position, allowing me to step outside my comfort zone and explore something new. While my career has been built on predictability and meeting expectations, I believe there’s value in venturing outside your comfort zone, especially when it aligns with your true self. My hope is to discover new passions and delve into them more deeply than I have lately. This experience brings back memories of my college days when I lived on a “special interest” dormitory floor for computers, called Computer Science House, that had a great community and environment for learning. Recurse Center reminds me of that fondly.
The Recurse Center principles prioritize self-direction and working towards personal goals. So, what are my goals?
Even drafting this blog post contributes to my objectives. Over the last few years, I’ve felt a creative block as my sharing has been confined to internal work discussions. In an introduction meeting, I heard the phrase, “the community is a resource to contribute to and draw from.” This resonates with me, and I’m eager to share my experiences on my blog and contribute to the larger community.
My broad goals include:
Striking the right balance between challenge and focus is a work in progress, but every day is progress. I’m excited about what I plan to accomplish during my time at the Recurse Center and look forward to the upcoming weeks.
Dokku is a free replacement for Heroku that you can run on your own server. Without using too many acronyms, it’s an app management tool built around Docker. I started using Heroku back in 2011, when they added support for Python. The simple command line interface for app deployment and free CPU time was a great selling point.
But as costs lowered for servers and with a personal desire to control more of my own tools, for privacy and security reasons, the option to run my own became really interesting.
There have been three major versions of my blog, with different goals at different times (although using three different programming languages wasn’t the goal, just a coincidence):
Middleman is still currently being used to generate this blog as a static website, but now with Dokku, it can be deployed and updated via git push, which is really easy and convenient!
Docker is the new hotness that makes deploying code really easy, but it’s still a very manual tool, which is where Dokku comes in. Dokku controls Docker and makes it easy to deploy whole applications; connecting databases, caching servers, or multiple servers together easily. It has a simple command line interface and support for plugins.
So what this means to me:
So I recommend Dokku if you are interested in running webapps easily and controlling your server. You can see this blog’s code on github, if you want to learn more.
I’ve been writing Python for about 8 years now, mostly on a smaller scale, but the last few years at Yelp have been really interesting to see testing done at a larger scale. Testing has become really important to me, as it helps all the other pieces of your software fit together better.
It was great to be able to share what I’ve learned and brush up on my presentation skills. Unfortunately, I didn’t manage to record a video of the actual presentation, but that was also a lesson learned for when you are presenting.
Testing is a best practice for delivering reliable software, but it can be a hard subject when starting out. What should you test and why? How much testing is enough? So you spent three days and wrote out tests for everything in your module, but was that an effective use of your time?
This talk will give an overview of the different layers that you can write tests for and why you should have them. You start with unit tests, mix in some integration tests, and cover with acceptance tests. Sprinkle with specific testing tools to taste. Tools we’ll discuss include py.test, docker, behave, tox, and coverage. Although the talk focus will be on web apps, the ideas will be relevant to all Python applications.
Writing quality tests is important: flaky tests will cost more time than they save and filler tests that don’t test important areas will weigh you down over time. With stable and effective tests for all layers, you build code you can trust, that you can refactor quickly or change easily without breaking everything. It’s as easy as cake!
For the longest time, my home directory (/home/askedrelic on most systems) has been a git repo. This has mostly worked but has several problems:

- it needs a .gitignore that excludes almost everything
- the .ssh folder could get added with your private keys

Despite these complaints, this method has worked for me for several years. It has allowed me to use default git functionality (git-submodules) for my vim plugins and easily keep them up to date. I can git-pull and have the latest configurations by re-opening my shell.
I wanted a way to upgrade my existing repo with minimal changes and keep my git-submodules. Several of the GitHub-recommended dotfile repos were not interesting to me because of their forced use of ZSH or complex Ruby/Rake scripts to handle updating. I have a bit of sunk cost with Bash and wanted the option to gradually upgrade to ZSH.
Therefore Zach Holman’s dotfiles looked best to me:
It took a few tries to figure out that copying his script/bootstrap was mainly what I wanted. Moving my existing git-submodules to a new location was obtusely hard, until I found a script to handle it: https://github.com/iam-TJ/git-submodule-move.
The upgrade was basically:
git clone https://github.com/askedrelic/dotfiles/ .dotfiles
cd .dotfiles && script/bootstrap
This new layout allows for much better organization going forward. Check it out and see if your dotfiles could use an upgrade: https://github.com/askedrelic/dotfiles/
I have a couple services that I use for my documents, my writing, my photos, and everything else that is a file. Files are easiest to deal with, as long as you have them organized. You always want your files in two places (your current computer and online), since you will drop your laptop and your current HDD will fail randomly. You want your files in a simple file format that doesn’t depend on a specific application: usually text, lossless image (PNG), or PDF. PDF might be a terrible format for technical reasons, but it’s a great way to preserve websites or Word documents exactly as they currently are. Every application has “Print as PDF”.
Dropbox is the natural choice for most temporary files. You can get 2GB to 5GB free starting out, it is constantly backed up when you are connected to the internet, and instantly available across all your devices. I keep most temporary files in my Dropbox folder, until I can organize them by date and archive them in a more permanent way.
A more permanent home solution for large file backup is an external HDD or NAS. I finally invested in a consumer NAS last year, the Synology Diskstation DS213, for about $200. Drobo is another well-known consumer NAS, but way too expensive in my opinion, especially considering how cheap hardware has become. The Synology has really impressed me in ease of use, software quality, and overall value. Dual disk RAID support, gigabit ethernet, and “app” support with automatic AWS Glacier backup are some of the more technical features that really make it impressive. Here is another good review, from the Wirecutter.
To balance everything out, you want permanent offsite backups: Backblaze or Crashplan are both good. I’ve used Backblaze for years at work, but am considering Crashplan personally, since Backblaze does not support backing up NAS. These programs are similar to Dropbox, but offer a longterm backup for around $50/year, which is a great price for safety and value of an offsite backup.
Now that you have someplace to backup files, backup every important online service you used last year into a simple file format.
Another project I started last year was backing up my childhood photo albums and VHS tapes! Well, I called my mom and asked her to mail them to me, but it’s a first step. How many boxes of tapes or old photographs do you have at your parent’s house? Convert them to a digital format today, before they fall apart.
I’ve looked at local video conversion places and have been quoted $25/tape for VHS to digital conversions, which seems pricey. I think I will wait to find a VCR and manually convert the videos myself with an Elgato USB recorder, which seems to offer a good value/quality trade off.
That’s my recommendation for the year. Everything that is physical fades eventually and everything that is digital can instantly disappear if you aren’t careful. Understand the digital world and do what is necessary to preserve your important memories and documents!
Testing can be approached in many different ways, with different goals. At the unit test level, you are essentially testing your API:
This is what many developers think tests are. I have always felt these were important from an academic standpoint, but definitely never gave them their due respect. Unit tests are a great way to explore your code and reflect whether it still makes sense on a second glance. No matter how trivial or simple the code may be, having the ability to change your mind repeatedly, with verified results, is useful.
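As a toy example of what testing your API looks like at this level (a hypothetical function, not from any real project):

```python
def slugify(title):
    # trivial code, but the test below still pins down the contract:
    # whitespace is trimmed, case is normalized, spaces become dashes
    return title.strip().lower().replace(' ', '-')

def test_slugify():
    assert slugify('  Hello World ') == 'hello-world'

test_slugify()
```

Even for something this small, the test lets you rewrite the implementation later and know instantly whether the contract still holds.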
Moving to a higher level, functional or system tests are good for determining whether 3rd-party libraries and your application as a whole are working correctly, and they are usually where I have spent much of my time, from a return-on-investment standpoint. Either the system works or it doesn’t, and this can be traced down quickly. Lately I have begun to focus less on them, due to getting better gains from improved unit testing. This is basically the debate over top-down versus bottom-up design, and I think testing at both ends of the spectrum is important.
This point has been one of the bigger breakthroughs I’ve made: tests communicate the spec. When I’m building a feature, I usually iteratively write code until I say it works. I’m continually running the code and I am doing the evaluating of the output to determine that the computer is generating correct output. But moving that evaluation step into code and automating it is a big, very useful, step. This is essentially BDD: letting business people write a code spec and having to match that spec.
Once a feature has been communicated into a code spec, changing that feature later becomes a migration; not having to start from scratch to ensure all my assumptions still work with subtle changes.
As my team has gotten larger, being able to say my code does something and then have a test to prove it helps with async communication and increases the speed of integration. Having a second source of code truth keeps everyone on the same page about what the code is doing and helps smooth the merge process.
For an open source project, having public tests helps show your concern for code quality and is an easy way for knowledgeable developers to jump into your code: how do I use your code? Well, I can always check out the tests because they better work. Across many projects, documentation is pretty rare or of poor quality. Both tests and documentation are important for your project, but while documentation fades with time, test code has a very binary usefulness.
Coming from doing most of my coding in Python and dynamic languages, this concern may not be as important in static languages like C#/Java, but I think it is still important.
Tests help verify your assumptions. Returning to my original point, failure will happen, and trying to plan for it is a much better solution than waiting until it happens. Dynamic typing makes it faster to write code but pushes many errors to become runtime errors. The number of times I’ve run into date/time/datetime conversion errors in Python has definitely pushed me to test more. I can assume what code is doing all day long, until I actually test it and find that one instance when the API does something I wasn’t expecting. Even assuming that you really understand dates or timezones is often incorrect.
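One such date/datetime trap, as a concrete example (this is standard-library behavior, easy to verify in a test):

```python
import datetime

d = datetime.date(2014, 1, 1)
dt = datetime.datetime(2014, 1, 1)

# datetime subclasses date, so isinstance checks pass for both...
print(isinstance(dt, datetime.date))  # True
# ...but equality between a date and a datetime is always False,
# even for the "same" day -- a surprise only a test will catch
print(d == dt)  # False
```

Code that mixes the two types can pass every isinstance check and still silently compare unequal, which is exactly the kind of assumption worth pinning down in a test.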
When you run into a new problem, having a test environment setup that you can easily jump into will save time and get you fixing things quicker. The more you invest in the test environment, the easier it is to solve new types of problems and quickly diagnose problems when they arise.
How much to test, what areas to test, what type of testing to use: these questions are always up for debate. Any level of testing is good and you can probably improve. In the world of GitHub, you rarely code “alone”; someone will always read your code and making it easier for them to read and analyze is a good thing. Finally, double checking yourself is a good thing. Testing is an investment, sometimes the return may take awhile to surface, but improving your testing ability is one step to becoming a better programmer.
My first thought was to do this over HTTP, with Requests for the client and Bottle for the server. I started writing some code, checked the Bottle docs for sending a streaming response, and was running with a few lines.
import bottle
import subprocess

@bottle.route('/stream')
def stream():
    proc = subprocess.Popen(
        'echo 1 && sleep 3 && echo 2 && sleep 3 && echo 3',
        shell=True,
        stdout=subprocess.PIPE,
    )
    while proc.poll() is None:
        output = proc.stdout.readline()
        yield output + "\r\n"

bottle.debug(True)
bottle.run(host='0.0.0.0', port=8000, reloader=True)
This code worked great in Chrome; console output was streamed with pauses and the connection was not dropped.
Then I started on a client in Python. Requests also supports streaming responses, just a parameter to the standard requests.get(), nothing too major.
import requests

r = requests.get('http://0.0.0.0:8000/stream', stream=True)
for line in r.iter_lines():
    print line
After running the client a few times, the output wasn’t getting streamed. Output would pop in, as if the response was fully downloading before printing. I tried increasing the sleep amount, to see if the response was too short, and tweaking the Python handling of output, trying to find some sort of implicit stdout buffering with export PYTHONUNBUFFERED=True. I rewrote the server in Flask, which offers the same streaming capabilities as Bottle, but encountered the same situation with the client not streaming the response. After consulting with teammates, we couldn’t see the problem and moved on to trying to stream a local SSH connection instead, which had its own host of environment problems.
The one variable I didn’t tweak in these examples was the response size, which seems obvious in hindsight. My first mistake was not testing boundary cases: all the minimum viable tests were extremely small and not extremely different. I moved on too quickly without diving deep enough into the problem: the docs and samples were all so simple, so I assumed nothing could be wrong with the libraries I was using.
This hackathon was all for fun, so I gave up and moved onto the next problem, but I returned the next day and dived deeper. My second mistake was trusting the docs: documentation is a great resource, but code never lies.
After looking at the function declaration for response.iter_lines(), it quickly made sense: the default chunk size for lines in the response was 512 bytes, which the tiny "1 2 3" output would never fill, so it never got split into multiple pieces. I was also not sending \r\n, the standard HTTP chunked terminator.
ITER_CHUNK_SIZE = 512
def iter_lines(self, chunk_size=ITER_CHUNK_SIZE, decode_unicode=None):
Setting chunk_size=1 made my client immediately print output, solving all my problems.
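The buffering behavior can be sketched with a simplified stand-in for iter_lines (illustrative only; this is not Requests’ actual implementation, and iter_lines_sketch is a made-up name):

```python
import io

def iter_lines_sketch(raw, chunk_size=512):
    # read chunk_size bytes at a time; only yield once a full
    # "\r\n"-terminated line is buffered. With a large chunk_size,
    # a tiny response arrives in a single read, so nothing is
    # yielded until the whole body is in -- no visible streaming.
    pending = b''
    while True:
        chunk = raw.read(chunk_size)
        if not chunk:
            break
        pending += chunk
        while b'\r\n' in pending:
            line, pending = pending.split(b'\r\n', 1)
            yield line
    if pending:
        yield pending

body = io.BytesIO(b'1\r\n2\r\n3\r\n')
print(list(iter_lines_sketch(body, chunk_size=1)))  # [b'1', b'2', b'3']
```

With chunk_size=1, each byte is examined as it arrives and lines are yielded immediately, which is exactly why the fix worked.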
To make testing easier, I’ve created a Github repo to demo all this code: https://github.com/askedrelic/streaming-demo.
While debugging this situation, I remembered to try HTTPie, a cURL replacement written in Python using Requests, which handled the streaming response correctly. Looking at HTTPie’s code for consuming responses led me to look at Requests’ code and figure everything out. Definitely recommend this tool!
Lastly, always check the code and don’t be afraid to dive deep.
It started with a virtualenv menu function I came across, from Peter Coles’ .bashrc:
menuvirtualenv() {
    select env in `lsvirtualenv -b`; do
        if [ -n "$env" ]; then
            workon "$env"
        fi;
        break;
    done;
}
alias v.menu='menuvirtualenv'
12:54:26 pcoles@peters_air:~ > v.menu
1) category-cms
2) collect
3) mrcoles
4) readmd
#? 3
(mrcoles)12:54:33 pcoles@peters_air:~/projects/mrcoles >
This method for selecting an input stood out to me for being so simple: just a numbered list. Many command line applications make input too complex, forcing the user to think about what they want to select and then type it in again, while many don’t even support tab completion.
Several months ago, I was on an airplane with no internet and decided to challenge myself to implement that select interface in Python. When looking at the code again, I found it was just a Bash builtin function:
select name [ in word ] ; do list ; done
The list of words following in is expanded, generating a list of items. The
set of expanded words is printed on the standard error, each preceded by
a number. If the in word is omitted, the positional parameters are printed
(see PARAMETERS below). The PS3 prompt is then displayed and a line read
from the standard input. If the line consists of a number corresponding to
one of the displayed words, then the value of name is set to that word. If
the line is empty, the words and prompt are displayed again. If EOF is read,
the command completes. Any other value read causes name to be set to null.
The line read is saved in the variable REPLY. The list is executed after
each selection until a break command is executed. The exit status of select
is the exit status of the last command executed in list, or zero if no
commands were executed.
I started hacking, got it working, then forgot about it. Now, coming back to the code with an internet connection, I’ve released it on PyPI and GitHub.
Pyselect wraps raw_input(), more or less:
In [1]: import pyselect
In [2]: pyselect.select(['apples', 'oranges', 'bananas'])
1) apples
2) oranges
3) bananas
#? 2
Out[2]: 'oranges'
But it can also be invoked as a Python module, when scripting:
$ python -m pyselect $(ls)
1) LICENSE.txt
2) build/
3) dist/
4) pyselect.egg-info/
5) pyselect.py
6) pyselect.pyc
7) setup.py
8) test.py
#? 4
pyselect.egg-info/
Or in a Bash pipe:
$ ls | xargs python -m pyselect
1) LICENSE.txt
2) build/
3) dist/
4) pyselect.egg-info/
5) pyselect.py
6) pyselect.pyc
7) setup.py
8) test.py
#? 5
pyselect.py
But that’s where things kind of fall apart. Within a standard interactive Python application, stdin and stdout are simple and pyselect just works. Getting the pipe-in to work required a bit more effort, hooking input/output up to the user’s tty, which the pipe drops. My holy grail would be a pipe-in and pipe-out to make selecting anything much easier:
$ ls | xargs python -m pyselect | cp $0 test.txt
Or display all your git branches and jump to one. Or virtualenvs. Or directories. Piping both in and out redirects pyselect’s input and output, so it doesn’t work. I’ve read up on named pipes that might be able to solve this, but I haven’t found a Python solution yet.
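A sketch of what that tty hookup looks like (illustrative names and parameters here, not pyselect’s actual API):

```python
import sys

def menu(options, infile=None, outfile=None):
    # print a numbered menu to stderr, like bash's select builtin;
    # when stdin is a pipe (full of data, not keystrokes), a real
    # implementation reopens /dev/tty to read the user's choice
    if infile is None:
        infile = sys.stdin if sys.stdin.isatty() else open('/dev/tty')
    if outfile is None:
        outfile = sys.stderr
    for i, opt in enumerate(options, 1):
        outfile.write('%d) %s\n' % (i, opt))
    outfile.write('#? ')
    reply = infile.readline().strip()
    return options[int(reply) - 1]
```

Prompting on stderr and reading from /dev/tty keeps stdout clean, which is what would let the pipe-out case (feeding the selection into cp, git checkout, etc.) work.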
To jump between git branches with bash select, I used this:
function gobranch() {
select branch in $(git for-each-ref --sort=-committerdate refs/heads/ --format='%(refname)' | sed 's/refs\/heads\///g'); do
git checkout "$branch"
break;
done;
}
For now, I have some other ideas to try with selecting things:

- a git add --interactive mode

Ideally, pyselect could become “input for humans”, ala Requests, because raw_input() could always use a more friendly API.