Lumosity is home to the world's largest cognitive training database, a responsibility we take seriously. For most of the company's history, our analysis of user behavior and training data has been powered by an event stream: first a simple Node.js pub/sub app, then a heavyweight Ruby app with stronger durability. Both supported decent throughput and latency, but they lacked some major features supported by existing open-source alternatives: replaying existing messages (also lacking in most message queue-based solutions), scaling out many different readers for the same stream, the ability to leverage existing solutions for reading and writing, and possibly most importantly: the ability to hire someone externally who already had expertise. We ultimately migrated to Kafka in early- to mid-2016, citing both industry trends in companies we'd talked to with similar durability and throughput needs, and the extremely strong documentation and community. We pored over Kyle Kingsbury's Jepsen post ( ), as well as Jay Kreps' follow-up ( ), talked at length with Confluent folks and community members, and still wound up running parallel systems for quite a long time, but ultimately, we've been very, very happy. We originally looked into Storm / Heron, and we'd moved on from Redis pub/sub. Heron had also just come out while we were starting to migrate things, and the community momentum and direction of Kafka felt more substantial than the older Storm. Heron looks great, but we already had a programming model across services that was more akin to consuming messages with consumers than to a required topology of bolts, etc. Since then, the Confluent Platform community has grown and grown, and we've gone from doing most development using custom Scala consumers and producers to being 60/40 Kafka Streams/Connect. Understanding the internals and proper levers takes some commitment, but it's taken very little maintenance once configured.
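The features the post calls out — replay and many independent readers of one stream — come from the log-with-offsets model, which fire-and-forget pub/sub (like the Redis setup we moved away from) can't offer. Here's a toy in-memory sketch of that model; it is not a Kafka client, and every name in it is illustrative:

```python
# Toy sketch of the log-based model: each reader tracks its own offset into
# an append-only log, so messages can be replayed and independent readers
# don't interfere with each other. All names here are illustrative.

class Log:
    """A minimal append-only message log (a stand-in for one partition)."""
    def __init__(self):
        self.messages = []

    def append(self, msg):
        self.messages.append(msg)


class Reader:
    """A consumer with its own offset; rewinding the offset replays messages."""
    def __init__(self, log, offset=0):
        self.log = log
        self.offset = offset

    def poll(self):
        batch = self.log.messages[self.offset:]
        self.offset = len(self.log.messages)
        return batch

    def seek(self, offset):
        # Replaying is just moving the offset back; the log itself is immutable.
        self.offset = offset


log = Log()
for event in ["signup", "game_played", "score_saved"]:
    log.append(event)

a, b = Reader(log), Reader(log)  # two independent readers of the same stream
first = a.poll()                 # all three events
a.seek(0)
replayed = a.poll()              # same three events again
also = b.poll()                  # b's offset is unaffected by a's replay
```

In a queue-based broker, delivering a message removes it, so a second team can't re-read history; here the log is durable and offsets are per-reader, which is the property that made hiring and tooling reuse easier.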
I have truly hated JavaScript for a long time. It's always been a nightmare to deal with all of the aspects of that silly language. But wowza, things have changed. I'm primarily web-oriented, and using React and Apollo together the past few years really opened my eyes to building rich apps. And I deeply apologize for using the phrase "rich apps"; I don't think I've ever said such Enterprisey words before. But yeah, things are different now. I still love Rails, and still use it for a lot of apps I build. But it's that silly "rich apps" phrase that's the problem. Users have way more comprehensive expectations than they did even five years ago, and the JS community does a good job at building tools and tech that tackle the problems of making heavy, complicated UI and frontend work. Obviously there's a lot happening here, so just saying "JavaScript isn't terrible" might encompass a huge number of libraries and frameworks. But if you're like me, yeah, give things another shot: I'm somehow not hating on JavaScript anymore.

The algorithms and data infrastructure at Stitch Fix is housed in #AWS. Data acquisition is split between events flowing through Kafka, and periodic snapshots of PostgreSQL DBs. We store data in an Amazon S3 based data warehouse. Apache Spark on Yarn is our tool of choice for data movement and #ETL. Because our storage layer (S3) is decoupled from our processing layer, we are able to scale our compute environment very elastically. We have several semi-permanent, autoscaling Yarn clusters running to serve our data processing needs. While the bulk of our compute infrastructure is dedicated to algorithmic processing, we also implemented Presto for ad-hoc queries and dashboards. Beyond data movement and ETL, most #ML centric jobs (e.g. model training and execution) run in a similarly elastic environment as containers running Python and R code on Amazon EC2 Container Service clusters. The execution of batch jobs on top of ECS is managed by Flotilla, a service we built in house and open sourced (see ). At Stitch Fix, algorithmic integrations are pervasive across the business. We have dozens of data products actively integrated into systems. That requires a serving layer that is robust, agile, flexible, and allows for self-service. Models produced on Flotilla are packaged for deployment in production using Khan, another framework we've developed internally. Khan provides our data scientists the ability to quickly productionize those models they've developed with open source frameworks in Python 3 (e.g. PyTorch, sklearn), by automatically packaging them as Docker containers and deploying to Amazon ECS. This provides our data scientists a one-click method of getting from their algorithms to production. We then integrate those deployments into a service mesh, which allows us to A/B test various implementations in our product.
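To make the "automatically packaging models as Docker containers" step concrete, here is a hypothetical sketch of the kind of automation a framework like Khan might perform. Khan's internals aren't described here, so the function name, file layout, and base image are all invented for illustration:

```python
# Hypothetical sketch of Khan-style packaging: turning a trained Python 3
# model artifact into a Dockerfile ready for an ECS deployment. Nothing
# here reflects Khan's actual implementation; names are invented.

def render_dockerfile(model_path, requirements):
    """Render a Dockerfile that bundles a serialized model with a serving app."""
    lines = [
        "FROM python:3.11-slim",                  # assumed base image
        "RUN pip install " + " ".join(requirements),
        "COPY {} /app/model.pkl".format(model_path),
        "COPY serve.py /app/serve.py",            # serve.py would load and expose the model
        "WORKDIR /app",
        'CMD ["python", "serve.py"]',
    ]
    return "\n".join(lines)


dockerfile = render_dockerfile("model.pkl", ["scikit-learn", "flask"])
print(dockerfile)
```

The one-click experience described above would then amount to running a generator like this, building the image, and registering the resulting container with ECS and the service mesh for A/B routing.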