~ ~ ~
Hi, I'm Pierce Freeman. I'm an ML engineer and founder.
Get in touch: pierce@freeman.vc
~ ~ ~

Mountaineer v0.1: Webapps in Python and React

February 27, 2024

Today I'm really excited to open source a beta of Mountaineer, an integrated framework to quickly build webapps in Python and React. Its initial goals are quite humble: make it really pleasurable to design systems with these two languages. Mountaineer provides the following key features:
- First-class typehints for the frontend, backend, and database
- Auto-suggest of all data and function signatures for ease of development in your IDE
- Trivially simple client<->server data binding, function calling, and error handling
- Server rendering of frontend components, using a bundled V8 engine
- Static analysis of frontend code for strong validation of types and links to other pages
- PostCSS support for Tailwind, CSS polyfills, etc.
It doesn't shoot for anything too custom. You don't write Python code to be compiled into React components. It doesn't spin up a new intermediary server to route requests. It's just vanilla Python and vanilla React. The framework focuses on helping them both play to…

Constraining LLM Outputs

February 20, 2024

LLMs are by definition probabilistic; for each new input, they sample from a new distribution. Even the best prompt or finetuning will minimize (but not fully resolve) the chance that they give you output you don't expect. This is unlike a traditional application API, where the surface area is known and the fields have a guaranteed structure. To use LLMs in any kind of downstream pipeline, you need to get them closer to this API world: you need to be able to enforce a standard API contract on the data you want. Requesting JSON output is the common way to do this, but even with the best prompts, often at least 5-10% of outputs are subtly invalid (missing commas, extra quotes, etc). Constrained generation restricts the model so it can only produce valid JSON. Technically, you can force any model to give you valid JSON, but the quality of that JSON will vary depending on model and configuration. Let's look into whether finetuning a model to produce JSON is preferable to forcing JSON…
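The post's own examples aren't in this excerpt, but the simplest way to see the contract problem is a validate-and-retry guard. A minimal sketch, where llm_call is a hypothetical stand-in for your model; note this is post-hoc rejection, not true token-level constrained decoding:

    import json

    def generate_json(llm_call, prompt: str, max_retries: int = 3) -> dict:
        # Crudest form of an output contract: validate after the fact and retry.
        # Real constrained generation instead masks invalid tokens at decode time.
        for _ in range(max_retries):
            raw = llm_call(prompt)  # hypothetical model call
            try:
                return json.loads(raw)
            except json.JSONDecodeError:
                continue  # missing commas, extra quotes, etc.
        raise ValueError("model never produced valid JSON")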

Passthrough above all

February 15, 2024

Being in VR in the Vision Pro is a composite of two different realities. One is the spatial background (either your actual surroundings or the simulated environments). The other is the window context that shows all your apps, the menu bar, and system functionality. Windows are linked to a particular location in space. If you put one in a room, walk out of the room, and walk back, they'll be exactly where you put them. I haven't tried it, but my hunch is that if you leave the country and come back, they'll also still be in the same place. The mapping of physical space is incredibly impressive. Since you can position these windows however you like, and view them from any angle, there's sometimes a conflict between the window's existence and your own passthrough reality. Try to place one in a room and then walk through a doorway, peeking back at the room from within the door frame. The window will typically remain un-rendered, even if you can see a partial location of where it…

How quick we are to adapt

February 6, 2024

What an extraordinary scene it is to see cars without humans driving these days. After Cruise was suspended, Waymo has the autonomous streets of San Francisco all to itself. The cars are accompanied by a distinctive electric whir when they come down the block. This noise is no doubt a function of the Jaguar base model they use, not the Waymo design specifically, but the sound has become indelibly paired with the lidar array spinning on top. The second you hear it, just glance outside to see one driving up the hill. Most of the people I know in San Francisco have used a Waymo at least once. Many friends of mine swear by them. The fact that they're self-driving doesn't really enter into the equation: they just prefer the product they're being offered when they're picked up. But I don't think that's how it starts. Most people are trepidatious at the prospect of getting into a car without a driver. They try it out in the pursuit of something new or at the urging of a friend. SF is inherently…

The curious case of LM repetition

January 22, 2024

I was doing some OSS benchmarking over the weekend and ran into an odd issue. Some families of models would respond with near-gibberish, even with straightforward prompt inputs. Given the low reported perplexity, I thought there had to be an issue. I refactored the core logic into a new notebook to isolate the extraneous factors. Here's the basic code: Couldn't be simpler, right? Load the model, tokenize the input, and generate with some pretty common sampling defaults. Still saw the same issue. This is the raw output (all warning messages included): I like cookies. But not this much. Here were my theories and what I checked along the way.
Finetuned model format
Especially in RLHF-finetuned models, training inputs will be normalized to a common prompt format. This helps to guarantee consistency between "user" text and expected output, while allowing users to specify different prompting techniques at the system level. If you've used hosted LLMs (OpenAI, Anthropic, etc.), this…
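The basic code referenced above didn't survive the excerpt; a minimal reconstruction of what the description implies (load, tokenize, generate with common sampling defaults), assuming a Hugging Face transformers model with a hypothetical name:

    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "some-org/some-model"  # hypothetical; the post's model isn't named here
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

    # Tokenize a straightforward prompt and sample with common defaults
    inputs = tokenizer("Tell me about yourself.", return_tensors="pt").to(model.device)
    outputs = model.generate(
        **inputs,
        max_new_tokens=128,
        do_sample=True,
        temperature=0.7,
        top_p=0.9,
    )
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))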

Debugging chrome extensions with system-level logging

December 19, 2023

I've been working on a Chrome extension lately that's getting closer to a public release. That shifts my workflow from blue-sky design to the basement: fighting every last bug. Extensions are basically mini web applications these days, just with access to a global chrome object that can interact with some browser-level functionality. Aside from that, it's all familiar. That extends to the debugging experience. Since extensions run in the regular V8 Chrome runtime, Chrome exposes the same debugging tools that you're used to on the web: profiling, stack tracing, code mapping, etc. Unlike a regular website, however, the potential edge cases of an extension are practically infinite. They need to tolerate the whole universe of pages where you're applying them. I've found one of the best ways to catch these edge cases and diagnose them after the fact is to capture verbose logging to disk. You can browse the web as you test your extension and then review the workflow session logs afterward. The…

Speeding up runpod

December 18, 2023

Runpod.io is my favorite GPU provider right now for smaller experiments. They have pretty consistent availability of 4x/8x configurations with A100 80GB GPUs alongside some of the current-generation Nvidia chips. One issue I've observed a few times now is varying runtime performance box-to-box. My working mental model of VMs is that you have full control of your allocation; if you've been granted 4 CPUs, you get the ability to push 4 CPUs to the brink of capacity. Of course, the reality is a bit more murky depending on your underlying kernel and virtual machine manager, but usually this simple model works out fine. On Runpod, since any configuration short of the full 8 GPUs is multi-tenant, you might be competing with other workloads. A few times now I've observed sluggish performance on the box (batch preprocessing slow to complete, bash commands slow to enter, etc.).
[Figure: an htop readout that you want to see at bootup.]
I might even have the box to myself. My default when starting…
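Not from the post, but a quick way to check whether a noisy neighbor is the culprit; a minimal sketch using psutil, assuming a Linux guest:

    import psutil

    # Sample CPU usage over a short window. On a multi-tenant VM, a high
    # 'steal' percentage means the hypervisor is handing your cycles to
    # someone else's workload.
    times = psutil.cpu_times_percent(interval=5)
    print(f"user={times.user}% system={times.system}% idle={times.idle}%")
    if hasattr(times, "steal"):  # only reported on Linux guests
        print(f"steal={times.steal}%")  # sustained >5-10% suggests contention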

Inline footnotes with html templates

December 17, 2023

I couldn't write without footnotes. Or at least, I couldn't write enjoyably without them. They let you sneak in anecdotes, additional context, and maybe even a joke or two. They're the love of my writing life. For that reason, I wanted to get them closer to the content itself. By default, Markdown and Markdown parsers will render footnotes at the end of the article. For instance: My goal was to instead have them look more like: Okay. Where's the tech design here? Footnotes should display next to their reference tags. They should resize and change position dynamically if the text wraps or the box model changes. If the window is too small, they should remain at the end. The end product looks like this. If they don't show up inline, you might need to resize your browser.1 The only way to accomplish this is with a bit of Javascript.2 Since this site uses vanilla html and Javascript only, we can add a small script to the page that accomplishes exactly that. First we style our post contents to have…

Parsing Common Crawl in a day for $60

December 14, 2023

Random notes and architectural choices to parse a full Common Crawl HTML dump in about a day, for $60. Common Crawl is the largest permissively licensed archive of the Internet. They conduct their crawls about 5 times a year and index about 3.5 billion pages every crawl. In addition to forming the bulk of the foundation of modern language models, there's a ton of other data buried within Common Crawl. Incoming and external links to websites, referral codes, leaked data: if it's public on the Internet, there's a good chance CC has it somewhere within its index. So much critical data is just sitting there, buried somewhere in the html markup. It's a dataset that I keep coming back to. And I've never been particularly happy with my method of querying it. Back when my previous jobs had a big Spark cluster, I'd typically use that - or sometimes Hadoop or sometimes even Athena. But inevitably managing these systems becomes a headache, prone to errors or crashes halfway through. Not to mention…
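The architecture itself is cut off in this excerpt; for orientation, a minimal sketch (mine, not the post's pipeline) of streaming a single CC WARC archive with warcio, where the URL path is illustrative rather than a real segment:

    import requests
    from warcio.archiveiterator import ArchiveIterator

    # Stream one gzipped WARC file from Common Crawl's public bucket
    url = "https://data.commoncrawl.org/crawl-data/.../....warc.gz"  # illustrative path

    with requests.get(url, stream=True) as resp:
        for record in ArchiveIterator(resp.raw):
            if record.rec_type == "response":
                page_url = record.rec_headers.get_header("WARC-Target-URI")
                html = record.content_stream().read()
                # ...parse html here for links, referral codes, etc.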

Adding wheels to flash-attention

August 20, 2023

flash-attention is a low-level implementation of exact attention. Unlike torch, which processes the attention multiplications in sequence, flash-attention combines the operations into a fused kernel, which can speed up execution by 85%. And since attention is such a core primitive of most modern language models, it makes for much faster training and inference across the board. It now has an install time that's just as fast. Tri Dao (the main package author) and I recently added precompiled binaries to the python package. I'll say upfront: this particular implementation is a rather unorthodox use of wheels. Standard wheels can only depend on operating system and python version; flash-attention requires CUDA and torch versions as well. So it naturally required a bit of off-roading. This is a breakdown of the approach we took.
What's in a wheel
Many optimized libraries contain C or C++ siblings, which must be built at some point before Python can execute them at runtime. Python's had support for these for a long…
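The excerpt stops before the mechanics, but the core constraint is easy to illustrate. A sketch (mine, not the package's actual build script) of detecting the extra dimensions a wheel for this package would need to encode:

    import torch

    # A matching prebuilt binary depends on more than OS + Python version:
    # it also has to agree with these two.
    torch_version = torch.__version__.split("+")[0]  # e.g. "2.1.0"
    cuda_version = torch.version.cuda                # e.g. "11.8"; None on CPU builds

    # A setup.py can use these values to construct the filename/URL of a
    # prebuilt wheel, falling back to a source build when nothing matches.
    print(f"torch={torch_version} cuda={cuda_version}")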

LLMs as interdisciplinary agents

May 26, 2023

The most compelling bearish case I see for LLMs is that they'll plateau in performance, either because we'll saturate novel data from large-scale crawling or because of some fundamental logical reasoning limitation of existing transformer architectures. "Hooray," the critics will cheer. We can go back to humans being in the driver's seat. Skynet will have to wait until another day. That last part might be true. But even a bearish case for performance can still change a lot. Let's make three conservative assumptions: (1) LLMs will be trained in largely the same way as they are today; (2) LLMs will only have knowledge of public information through large-scale text corpora; (3) experts (either individually or as a consortium) will still perform better than LLMs on professional-grade tasks. In this world, the real breakthrough with large language models might not be exceeding human levels of performance in a discrete task. Perhaps it's enough that they can approach human-level performance in a variety of…

Representations in autoregressive models

May 11, 2023

I took a computer vision course in college with a rather provocative professor. One of the more memorable lectures opened with a declaration: representations in machine learning are everything. With a good enough representation, everything is basically a linear classifier. Focus on the representation, not on the network. When we were simultaneously training neural networks with millions of parameters, I thought it a rather insane thing to say. Technically he's right, of course. If you can shove most of the difficult work into taking an input and projecting it into a numerical representation (one conditioned on the task you want to accomplish), you have by definition turned your problem into a separable one. Once you've solved your problem, you've basically... solved your problem. Technically speaking, the representations can collapse to one common set of representations for True and one common set for False. It's interesting to think…

Let's talk about Siri

April 28, 2023

I was considering a quick weekend project to route Siri requests to ChatGPT. My immediate thought was pretty simple based on my existing mental model for Siri as a local<->server pipeline:
1. Set up a MITM proxy or custom DNS server
2. Intercept outgoing Siri requests and route them to a backend LLM
3. Respond with a correctly formatted payload, potentially with some additional utilization of the widgets that are bundled in iOS for weather or calculated responses
That required me seriously poking around with Siri for the first time in a couple years. A lot has changed since I last took a look.
Everything's Local
Since iOS 15, all Siri processing is done locally on device. There's a local speech-to-text model, a local natural-language-understanding module, and a local text-to-speech model. The local NLU module surprised me the most. All logic appears hard-coded and baked into the current iOS version. It'll respond to known tasks like the weather, setting a timer, converting weight, looking up…

Minimum viable public infrastructure

April 27, 2023

Minimum Viable Products (MVPs) are popular in startups because they allow testing of underlying risks without needing to build a full solution. In hardware companies, MVPs demonstrate that a solution can be physically built given specification constraints. In software companies, they prove that a problem is legitimate and there is consumer interest in the solution. Once you get the initial signal, iterate. Move fast and break things. The other clichés also apply. Public infrastructure obviously isn't treated the same way. You can't iterate when you're building huge things. You also can't tolerate failure in the same way. You don't want a bridge constructed in a month only to fall down the year after. The bulk of the bureaucracy for infrastructure is making sure projects meet this bar of safety: safe to use, safe to be around, and safe for the environment. But we do have minimum viable infrastructure: situations where we have some basic infrastructure but it's simply not good. You can…

Reasoning vs. Memorization in LLMs

April 13, 2023

The parameters of LLMs today have to jointly store facts as well as general reasoning logic. When you ask the model to reason about something that it hasn't seen before, I consider that general reasoning. This could be some word-based predicate logic or otherwise executing rote tasks where experts commonly agree upon the end result. I've been pleasantly surprised at how well modern LLMs can reason - although it's certainly not infallible. For instance: > If a hippo is the only creature in the world to have milk, and Shawn is currently drinking milk, where did it come from? If a hippo is the only creature in the world to have milk, and Shawn is currently drinking milk, then it must have come from a hippo. However, it's important to note that this hypothetical situation is not reflective of reality, as many other mammals produce milk as well. Of course, I assume hippo milk wasn't in the training data. But you can also ask the LLM about real-world situations, many of which are at least…

Automatically migrate enums in alembic

March 30, 2023

I don't know if people have come up with a good acronym for Python services that compete with MERN or LAMP, but if they have, then SQLAlchemy and Alembic are almost certainly included. SQLAlchemy (recently in version 2.0) makes it easy to define ORM schemas for database objects, and Alembic keeps everything updated with automatically generated migration files. If you're using this stack then you probably know the pain that code enums introduce. Declaring an enum requirement in a model is pretty straightforward: And Alembic will even pick up on the new enum creation: So far, so good. Unfortunately, when you actually change this enum (which, as you know, does happen), you're out of luck. Alembic ignores the enum value change even when it's out of sync with the current database value. So this change: Creates no diff: And will result in a database error if you actually try to use it. Spoiler alert: We probably want to use it. I stumbled upon alembic-autogenerate-enums, which is a neat approach to…
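The code blocks didn't survive this excerpt; a minimal sketch of the kind of model declaration being described, with illustrative names:

    import enum

    from sqlalchemy import Column, Enum, Integer
    from sqlalchemy.orm import declarative_base

    Base = declarative_base()

    class TicketStatus(enum.Enum):
        OPEN = "open"
        CLOSED = "closed"

    class Ticket(Base):
        __tablename__ = "tickets"
        id = Column(Integer, primary_key=True)
        status = Column(Enum(TicketStatus), nullable=False)

    # Later adding a value (say PENDING = "pending") to TicketStatus is
    # exactly the change that Alembic's autogenerate silently ignores.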

Greater sequence lengths will set us free

March 20, 2023

GPT-4 was announced this past week. Some key takeaways from their paper: (1) greater performance on human tests (Uniform Bar Exam, LSAT, AP Calculus BC) in addition to ML benchmarks, showing greater logical reasoning ability and the capability to synthesize information across multiple academic domains; (2) introduction of multimodal input, where images can be included alongside text and where text prompts can reference the content of the images themselves; (3) greater sequence lengths available, with models for 8K tokens and 32K tokens compared to GPT's current 4K. Improvements in (1) and (2) speak for themselves. But personally I'm far more excited about the trends we're seeing in (3).
Historical Attention Limitations
The explosion of transformer architectures (BERT all the way to the GPT family) was facilitated through attention. Attention allowed for far richer modeling of long-term text dependencies, since the end of a sequence could directly reference the start of a sequence with no loss of resolution…
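Not from the post, but the arithmetic behind why sequence length has historically been capped: self-attention scores every token against every other token, so the score matrix grows quadratically.

    # Self-attention builds an n x n score matrix per head/layer,
    # so memory and compute grow with the square of sequence length.
    for n in (4_096, 8_192, 32_768):
        print(f"{n:>6} tokens -> {n * n / 1e6:>8,.0f}M attention entries")
    # 32K tokens is 8x the length of 4K, but 64x the attention entries.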

On learning to ski

March 8, 2023

In Reinforcement Learning there's a core tradeoff between exploration vs. exploitation. In a limited-time game, you have to choose how much you pursue novel paths and how much you stick with what you know. The latter becomes that much harder once you know that a policy mostly works. When do you decide to stop optimizing and call perfection the enemy of good? Common wisdom says children explore while adults exploit. Children are willing to try new things in their spare time. Adults fall back on old hobbies and accept their limitations. At some point, we tend to transition from one to the other - perhaps because of risk intolerance, time limitations, or sheer laziness. There's a body of psychology research into cultivating growth mindsets and how certain behaviors can help or hurt learning. At its core, the research shows that mindset is more important than neural plasticity. I'm sure that's right. But I feel like the hesitation to learn new things can be summed up more directly. Namely…

Using grpc with node and typescript

February 16, 2023

Documentation for using grpc within node is split between static generation and dynamic generation. Dynamic generation compiles protobuffer definition files at runtime (through @grpc/proto-loader) and typically looks like the following: Most of the grpc docs use the dynamic approach - I assume for ease of getting started. The main pro of dynamic generation is faster prototyping if the underlying schema changes, since you can hot reload the server/client. But one key downside is not being able to typehint anything during development or compilation. For production use, compiling down to static code is a must. I've started using a pipeline that re-generates static compiled files and their typescript definitions automatically. Here's how to do it. The standard protoc that generates code for Go/Java/Python/etc. doesn't have support for generating javascript definition files. Where's my javascript at?? JS generation is only supported through the grpc-tools library, which includes a wrapped version of…

Opportunity years

February 15, 2023

The last few months have been tough for a lot of people - young, old, it doesn't much matter. Layoffs, down rounds, and bankruptcies jolt the seemingly inexorable progression of life. Decisions that were within grasp no longer are. Starting a family or retiring are eclipsed by job applications and culture-fit interviews. Yet the people I know have never been more excited. A lot of people are still starting companies, raising funding, and exploring new fields. In some cases the layoffs have been an excuse to seek change. If they can't attain stability at a big company, why not start their own thing? We're seeing one of the largest reallocations of labor in our lifetimes. Most of those affected by these large-scale layoffs are not going into unemployment. They're going to different companies or other career paths to try something new. After the monotony of the pandemic, stable growth is being replaced by risk taking. Big technology companies have their advantages. They have…

Buzzword peaks and valleys

February 14, 2023

At the turn of 2017, everyone was talking about AI. The rise of deep learning and new transformer architectures seemed ready to thrust us into an age of innovation. Every company wanted to be an AI-first company: rebranding, adding copy to their promotional pages, etc. Most didn't change their underlying tech. They had a Logistic Regression model for a particular feature and suddenly they were AI Everywhere All At Once. Many companies I researched during this time didn't have a single ML engineer or data scientist. At the turn of 2020, everyone wanted to go into Web3. A proliferation of startups looked to reinvent the core stack of the financial ecosystem (instant clearance, payments, trading platforms) alongside the backbone of the Internet (peer detection, file sharing, identification). Suddenly every company wanted to become a Blockchain company. R&D groups were spun up to investigate how blockchains could be slotted into existing business practices and leveraged. AI took a temporary backseat…

Network routing interaction on MacOS

January 2, 2023

There is a series of resolution layers governing DNS, IP, and port routing on OSX. Here are some of the different interfaces to manipulate how you route traffic to the internet or to localhost.
/etc/hosts
The hosts file forms a direct association between domain and IP address. It effectively acts as a higher-priority record than an entry in a DNS lookup table. Note that this file does not support port routing; requests will be routed 1:1 from synthetic domain name to IP. Given an entry like 127.0.0.1 myapp.local (the hostname is illustrative), accessing that domain in curl or a browser will route to 127.0.0.1 accordingly.
ifconfig
Allows you to analyze and manipulate the different networking interfaces on your computer, and to view all of the available interfaces. It also allows the creation of new synthetic IP values, depending on the support of the networking interface. So instead of having localhost only be 127.0.0.1, you can also create 127.0.0.2 within the loopback interface.
pfctl
Mac replacement for ipfw with a similar command structure. This utility focuses on filtering…

The provenance of copy and paste

December 19, 2022

Much of the American corporate world runs off of manual workflows. Excel, email, Slack, and CRMs are still the workhorses of getting things done. They also share some core similarities, namely that they're tools prized for their flexibility. They are the hammers that make everything a nail. I've seen a fair share of complex workflows across the above applications. Many approximate the functionality of a full-fledged computation graph. But instead of nodes and edges, they have people running hard-to-automate processes. These workflows are usually seen as pretty brittle, and for good reason. The common engineering concern is on the application layer: missing schemas and the lack of value typechecking that can lead to unexpected issues downstream. A topic that receives less attention, however, is the provenance of data that flows into and out of these different tools. I often find myself going through documents that I've written or that were written by colleagues. I almost inevitably have to…

Debugging tips for neural network training

December 16, 2022

Andrej Karpathy wrote a great post a few years ago about training neural networks. Here are a few additional things that I follow during implementation, with a bias towards debugging large language models.
Log anything and everything
Weights
Set up logging upfront as extensively as possible. I use wandb for experiment reporting. I find it the best option on the market right now for experiment tracking, with near-unlimited personal use. Log tensors of the training process, especially the updates in layer weights. Look out for gradients that go to zero and stay there. Sometimes this is just a byproduct of the current loss landscape, but it's often a signal that the network has saturated the learning it can actually do. Specifically:
- Log gradient magnitude of each layer
- Log gradient distribution of each layer
- Log weight matrix norm
- Log weight matrix distribution
Both of the distribution logs are handled through wandb's built-in watch utility. Matrix norm can be easily done by iterating over…
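A minimal sketch of both halves (the watch call and the manual norm logging), assuming a torch model; the project name is illustrative:

    import torch.nn as nn
    import wandb

    model = nn.Linear(10, 10)  # stand-in for your real network

    wandb.init(project="training-debug")  # illustrative project name
    # watch() hooks the module and logs gradient + weight distributions per layer
    wandb.watch(model, log="all", log_freq=100)

    # Weight matrix norms, logged by iterating over named parameters
    wandb.log({
        f"weight_norm/{name}": param.detach().norm().item()
        for name, param in model.named_parameters()
    })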

AWS vs GCP - GPU Availability V2

November 14, 2022

[Figure: the dreaded lack-of-availability screen that motivated my first experiment.]
When I compared AWS vs GCP GPUs in September, it ended up generating quite a conversation on HN and Github. To briefly recap the context there:
- I needed dynamic GPU allocation for spiky data processing jobs with near-realtime user latency guarantees
- Availability (i.e. will a GPU spawn) and cold-start launch (i.e. how long it takes) are the two most pivotal KPIs
- I attempted to quantify how often GPUs are fully unavailable, since I ran into some anecdotal instances where GPUs wouldn't boot for 30 minutes on end
After publishing that post I ended up working with an assortment of Google engineers to see what was going on. Here are some learnings and a re-run of the previous experiment. For completeness I'll cover the full list in case you are struggling with some unexpected GPU allocation issues in production.
409 Conflicts
The lion's share of unresolvable errors during the first trial were 409 errors. These indicated that…

Independent work: October recap

November 5, 2022

It's been a month since going full time on my own thing. In some ways I'm surprised by how natural the transition has been. My morning standup meeting has morphed into a journaling session. My 1:1s have migrated to Discord chats about open source projects. Quarterly planning and tech designs have been replaced with my IDE. I was expecting many more moments of friction with the new lifestyle and a constant question of "What in the world are you doing?" They never really materialized. As it is, there's just the right amount of boredom and busyness. I have a growing todo list but the things on it really excite me. It's been a while since I've had the feeling where everything seems new.
Lifestyle
My days structurally look similar to how I was spending them at my last job, just with way fewer Zoom calls. I start every day with a cup of coffee, spend ten minutes writing in my journal, and then fire up my laptop. The rest of the day until evening is spent at the keyboard, punctuated by a few walks…

Relationship modeling

October 25, 2022

Given the pandemic's isolation of friends and friend groups, I've been thinking a lot about relationships. Which ones fulfill, which ones entertain, and which ones are resilient to strain. One axis that seems to track relationship maturity is conversation temporality. In a new relationship, you're mostly focused on the present or recent past. You're playing mini-golf. You're sharing coffee. You're telling stories. You're enjoying the same moments together. As relationships mature, you focus more on the past and future. How did you grow up? What are your fears and dreams? Why do you have the same wallet that your grandfather gave you years ago? You're theorizing about the future and grounding it in the past. Why do close relationships devote such a significant portion to moments beyond the horizon? Why do we care so much about the past? One answer is that we require intimacy before we even broach these subjects. Deep conversations seem vulnerable, and vulnerability is more comfortable with…

The power of status updates

October 19, 2022

When I was leaving Paris by train out of Gare du Nord, we showed up twenty minutes before our scheduled departure. Typically that's more than enough time to grab a bite, find your cabin, and settle in before you roll out of the station. We checked the schedule board and saw our train wasn't yet assigned a platform. No problem. We got some food and came back around. Still no official gate. As the clock ticked closer to departure, people started frantically running around. "Do you know which gate is ours?" "Is that our train?" With every minute that went by, a larger crowd assembled around the status board. When one person caught wind of a rumor, there was a manic movement of people to a new platform to try and board that train. Everyone was turning to their neighbors and asking if they had any idea what was going on. We had pieced together which track our train was on and that it was delayed by an unknown period. We told one group what we knew. Then another. Before we knew it, we were…

A new chapter

October 13, 2022

Last week I said goodbye to my colleagues at Globality after five years on their engineering team. It's hard to believe it's been so long. I still remember my first day perfectly - no laptop, no desk, not even a manager to greet me. I ended up writing my first PR on a personal computer in the kitchenette. All farewells are bittersweet - but this one particularly feels like a chapter ending and another opening. I watched us grow from no clients to serving many of the Fortune 50. We went from a literal garage to offices in Palo Alto, London, and Tel Aviv. And for me personally, we went from no machine learning platform to a full ecosystem of models. We solved the core Internet-scale problems that I had set out to solve. It felt like a natural inflection point, which meant it was time for something new. I've been putting a lot of thought into what I want to focus on next. Here's my current list.
Building
I tried to carve out maximum time in front of the keyboard over the last few years…

Give my library a coffee shop

September 28, 2022

Borders Books was one of my favorite stops as a kid. I could spend hours walking through the different aisles while browsing for a novel. And I wasn't the only one. The stores were thriving with people. Some were there to quickly buy a paperback. Others took a more meandering pace, sampling a book over a coffee in the attached coffee shop. I was far too young for a latte but still remember the smell. Drip coffee mingled with the light musk of paperbacks on a crisp fall's day. My Borders closed a long time ago. I think that store is a Crate and Barrel these days. Over the last fifteen years, online ordering and e-ink tablets have taken a big chunk out of independent booksellers across the states. I don't take issue with that directly. Getting books online has many advantages - they're usually cheaper, more convenient, and let people self-publish in a way that was inaccessible even ten years ago. But it seems clear that the target demographic of physical book stores is becoming reserved…

AWS vs GCP - GPU Availability V1

September 21, 2022

There's an updated (and more accurate) comparison here: AWS vs GCP - GPU Availability V2. Cloud compute is usually seen as an ethereal resource. You launch VMs and spin them down, billed to the second. The billing and the mental model make it seem like these resources are limitless. That's typically one of the selling points versus on-prem compute: they can scale responsively to your load, so you're not paying for excess compute that you don't need, but it's there when you want it. Of course, in reality they're not limitless. Cloud compute is backed by physical servers. And with the chip shortage of CPUs and GPUs, those resources are more limited than ever. This is particularly true for GPUs, which are uniquely squeezed by COVID shutdowns, PoW mining, and growing deep learning models. This can lead to resource availability issues when you need to spin up boxes on-demand, like for training and heavy inference load. And resource availability constraints mean you can't count on them being around…

Headfull browsers beat headless

September 7, 2022

Twenty years ago, a simple GET request would open up the world. HTML markup was largely hand-designed, so tags and attributes were easily interpretable and parsable. Now most sites render dynamic content or use template-defined class tags to define the styling of the page. To handle this richness of rendering, most production crawlers use headless browsers - at least for a part of the pipeline. Since you're running a chromium or webkit build, these should render sites exactly how users see them. In reality, headless browsers are sometimes quite different: The headless chromium build is a different executable - there are some codepaths that are only available in the full build or have different behavior in headless mode. Extensions are one example; Chrome supports them but headless does not. There are some Javascript APIs that are missing from the headless implementations, mostly with regard to viewport or other screen features. There are canvas elements that render differently - both because of missing…
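The excerpt cuts off before any fix; one common workaround (my sketch, not necessarily the post's approach) is driving the full headful build, e.g. with Playwright under a virtual display such as xvfb:

    from playwright.sync_api import sync_playwright

    # Launch the full (headful) Chromium executable rather than the headless
    # build; on a server with no monitor, wrap the script in xvfb-run to
    # provide a display.
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=False)
        page = browser.new_page()
        page.goto("https://example.com")
        print(page.title())
        browser.close()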

Webcrawling tradeoffs

September 6, 2022

A couple of years ago I built our internal crawling platform at Globality, which needed to be capable of scaling to two billion pages each crawl. We had to consider some early design tradeoffs that influenced the rest of the architecture. The most fundamental was which rendering engine we wanted to adopt - and like everything in systems design, each choice had different tradeoffs. The two main types of crawlers deployed in the wild are typically raw or headless:
Raw HTTP
Request raw html over the wire. Bind to the host's socket and issue a GET for the page of interest, then continue to discovered links in a BFS search.
Pros: Fast. Only downloads the html text payload (still measured in KB on the largest sites). Trivially parallelizable through async processing or threading in your language of choice; an average server can usually accommodate tens of thousands of requests in parallel.
Cons: SPAs are broken; they often don't render or don't wrap their links with anchor tags. Additional JS-populated…
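A minimal sketch of the raw-HTTP flavor described above (mine, not the platform's code), using aiohttp with a BFS frontier:

    import asyncio
    import aiohttp

    async def crawl(seed: str, max_pages: int = 100) -> None:
        seen, frontier = {seed}, [seed]
        async with aiohttp.ClientSession() as session:
            while frontier and len(seen) <= max_pages:
                url = frontier.pop(0)  # FIFO pop gives BFS ordering
                try:
                    async with session.get(url, timeout=aiohttp.ClientTimeout(total=10)) as resp:
                        html = await resp.text()
                except Exception:
                    continue  # dead link, timeout, non-text payload, etc.
                # ...extract hrefs from html, add unseen links to `seen` and `frontier`

    asyncio.run(crawl("https://example.com"))

A production version would run many such workers concurrently off a shared queue; this only shows the fetch-and-frontier skeleton.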

Busses can fool me thrice

August 30, 2022

Public transit is often framed as necessary philanthropy for cities. It cuts down on cars and pollution at the expense of convenience. If people can more efficiently get to their destination by other means, they will. This is the wrong way to look at things. Public transit at its best can be additive for individuals, not just the aggregate good. You don't have to worry about parking, you don't have to worry about traffic, and you know exactly how long it's going to take to get from Place A to Place B. London, Paris, and New York have this figured out - everyone takes the subway, even if they have other options. There is broad support for public transit because it fits into the everyday fabric of city life. But for public transit to really work, it needs trust. I was waiting for a Muni early in the spring when I first moved to San Francisco. It was a beautiful day; the sun was streaming and traffic was lazy. Green routes all across the city according to Google Maps. The bus schedule said to…

Falling for Kubernetes

August 7, 2022

I've considered myself a strong Kubernetes skeptic in the past. Bare metal is always my first choice, both for projects and startups. That includes the stack that runs this blog. Calling it a stack might even be an exaggeration: it's a CI toolchain with an nginx configuration on the host. But it does its job, can handle surprising concurrent load, and is cheap to host. It costs just $10 a month and can likely go side by side with corporate blogging platforms that cost two orders of magnitude more to host. Premature optimization might be the root of all evil, but so is premature scale. I'm convinced that companies overcomplicate their architecture prematurely, which leads to headaches for engineers and instability for users. A monorepo with a simple server should be the default place to start. Run a basic docker instance to minimize dependency hell and make sure your remote configuration is reproducible on your local development machines. Run a few daemons either in Docker or with cron…

Content that I'm obsessed with

August 2, 2022

A constantly updating collection of content that I highly recommend to others.
Movies
Everything Everywhere All at Once - at moments you're not sure exactly what you're watching, in the best way possible. Stunning visual effects, sharp dialogue, and a modern hero's journey. All playing out in the setting of a mother reevaluating her relationship with her daughter and considering other choices she could have made along the way.
Charade - a time capsule for an older Hollywood. Fast banter with twists and turns that keep you guessing for two hours. My introduction to Cary Grant and Audrey Hepburn, which shocked my parents, but it's better late than never.
TV
White Lotus (HBO) - episodic exploration of racial tension, power imbalance, and a manic supervisor. All backed by beautiful Hawaiian views and the sound of waves. In the end, Quinn was the only one that really had a holiday.
Prehistoric Planet (Apple TV+) - incredibly photorealistic rendering of dinosaurs. Cites broad research on lifestyles…

Remote work is a better tourism

June 9, 2022

I've worked remotely from Kona (Hawaii), London (United Kingdom), Marseille (France), and Tahoe (California) over the last few years. I'd been to most of those places before as part of vacations. In almost all cases, I've vastly preferred working there. It gives you the encouragement to do what locals do. Go to coffee shops. Exercise in the evenings. Find some cheap grocery stores. Cook a meal and invite neighbors. Get out of the city and enjoy a hike. You're way more likely to meet people who live there if you engage them where they're most likely to be doing work and living lives themselves. In Hawaii I learned how to surf right outside of town. It was a small place with a constant hum of tourist foot traffic. Since conditions changed so quickly that winter, I called the shop almost every day to check conditions. It turns out that most of their clientele is transient and they only have a few people that surf there permanently. Just showing up week in and week out was enough to get…

The new opportunity in travel

May 29, 2022

Airbnb announced their new office policy last month. In a move that surprised no one for a global travel company, they are fully embracing the pitch of remote work. It's a long and well-considered piece, so I recommend reading it in its entirety. But these few sentences didn't get enough attention: "If you move [within one country], your compensation won't change... Permanent international moves are much more complex, so we won't be able to support those this year. Starting in September, you can live and work in over 170 countries for up to 90 days a year in each location." International moves are still discouraged by policy, both by compensation adjustments and by logistic overhead. But living in another country for up to three months is allowed - meaning employees can temporarily work around the world. If strung together, these shorter stints can sum to constant international travel with the same pay. At Globality, we're rolling out a similar policy for around a month at a time…

Labor markets calibrate satisfaction

May 10, 2022

It's clear that different career paths have vastly different earning potentials. What explains the discrepancy? Talent is a combination of intelligence, grit, and access. Before talented people start to specialize after high school, I firmly believe they can do almost anything. Their decision of what they will do reflects some internal prioritization of interests and priorities. But every industry wants these talented individuals. So what explains the differing pay? Something is clearly lost in our typical conversation about what a salary includes. Traditionally, a package is some mix of cash salary and equity. Tech companies have some tradeoff between these two factors: startups offer more equity in return for lower salary, with the promise of more upside; established players offer less equity in return for higher salary and more stable returns. Across all of technology - regardless of firm - you'll notice that talented employees command the same rough total compensation. The market is clearly somewhat optimal in setting this…

Installing FastText on an M1 Mac

May 5, 2022

We rely on FastText in some of our NLP microservices. Since upgrading to an M1 Macbook, these dependencies have failed to build wheels. After some digging, the default C headers that ship with Xcode turn out to cause a definition conflict. Removing them fixes the problem: we move them to a temporary folder so we can restore them later, since they are functional for most of the other clang system installs. And with that, we have a successful install…

Architecting a blog

January 4, 2022

One of my goals for the new year was publishing more frequently. I had a few qualifications:
- Remove as many barriers to composition as possible.
- Styling should be set-it-and-forget-it. No customization required per article.
- I should be able to take notes on my mobile and convert these to articles over time. The difference between a draft note and a published piece should be as small as possible, aside from its inherent fidelity.
- The stack should be the absence of a stack; there's no need to host a Kubernetes cluster for a personal blog.
Writing Experience
I started with my ideal experience when writing pieces, since this is where I'm going to be spending the most time. The technical backend can be a bit more complicated if it means allowing me to be most productive at the keyboard. I wanted to dedicate a few weeks tweaking the workflow so it became as habitual as possible, before investing in the supportive tooling. I converged on a completely file-based workflow. Markdown was the natural…

Write where you are

January 3, 2022

A good friend of mine recently shared her resolutions for the new year. Heading the list was writing a sentence every day. Rain or shine, writer's block or not. She's a writer by hobby but not profession, so personal writing is often thrown to the side in favor of work or other priorities. The goal was to form a habit - and the best way to habituate a new ritual is to power through even on days when you're not feeling it. I've maintained a growing collection of notes on my laptop over the past few years. Book reviews are sprinkled alongside research themes, which sit next to thoughts and learnings on management. I rarely look back at these notes - and even more rarely finish them. The meandering thoughts form more of a scratchpad for ideas. But some have the sketches of a full piece. They just never see the light of day. Publishing has always been my bottleneck. During stints on Wordpress or Medium, I was so focused on how articles looked that it often got in the way of what they said…

Treat engineers as users

December 30, 2021

Engineers are happiest when they're starting a new project. Certainly some excitement comes from pure novelty, but some is deeper. You have a blank slate to build on. You can take a step back and consider new languages or libraries. Compilation feels fast and expressive. You change one line in your IDE and watch your browser refresh in realtime. Once that project matures, something changes from those early days. Your build times grow into the awkward twenty-second range: just long enough to answer a few messages on Slack but probably not long enough to refactor it with a faster runtime. You fiddle with libraries and microservices in different repositories, putting up a handful of PRs when you're trying to ship a single new feature. Your unit tests grow into the thousands, with little organization or narrative structure to guide you through what has already been written. All these things add up. And they add up to your stack being a pain to use in practice. Building is no longer fun. It…

Scoping an ML feature

April 26, 2021

This article came out of a presentation that I gave to our Product organization about defining and building ML features. I try to unpack the questions that a product team should consider as they're looking to understand where machine learning can supercharge their user journey and where it might fall flat. Business leadership, product managers, and researchers too often speak different languages when talking about machine learning. I've been lucky enough to spend the last four years at the intersection of product development and machine learning. I've run both product teams and machine learning research groups. I've also advised business leadership on corporate strategy around adopting useful machine learning systems. I've seen these different perspectives firsthand while helping to bridge the worlds. Most confusion when building ML features comes at the beginning of a project. The goals are vague, the data isn't in the expected format, or the metrics are ill-defined. This is a key…

AI needs a better definition

April 14, 2021

Have you tried to define Artificial Intelligence lately? Go ahead and find a common thread linking the tens of thousands of marketing websites that scream it from the rooftops. I'll wait. People label AI as anything and everything these days. You have search systems, you have process automation, you have spam filters. If motion-activated supermarket doors were invented today, I guarantee they'd be branded AI too. The academic community certainly doesn't help provide a clear definition. AI encompasses a broad range of research pursuits. It covers search systems, computer vision, chatbots, natural language processing, robotics, and game playing. It's typically a department or subdepartment at most universities, like you would find Systems Design or Information Theory. When's the last time you heard your manager ask to apply "systems design" to solve a problem and then leave it at that? Then again, academia has never been particularly interested in domain clarity. But as Artificial Intelligence…

NFTs are nothing new

April 5, 2021

NFTs have exploded into mainstream conversation over the last few weeks. Like with everything in crypto, you have strong bulls and equally strong bears on the investment thesis. It's an ephemeral good, which defies many of our existing assumptions about object ownership. It's a novel investment class that's only made possible by the blockchain and distributed ledgers. But really - are the principles behind NFTs anything new? And what can collector culture tell us about the investment opportunities with these new tokens? Here I sketch out a framework to assess NFTs and some market opportunities to create value in the ecosystem around them.
A Brief Recap: What's an NFT?
If you have a basic knowledge of how cryptocurrencies are traded, you'll be able to easily understand how an NFT is minted and exchanged. By way of recap: let's say that you're looking to buy 1 Bitcoin for 10 Dollars. There is a token (Bitcoin), a seller (or source wallet), and a buyer (or destination wallet). A transaction…