Kaoru
Kohashigawa

Weapon of Choice


sift science benchmark siege

![Evan Kirby](https://images.unsplash.com/photo-1495435798646-a289417ece44?dpr=2&auto=format&fit=crop&w=400&q=80&cs=tinysrgb&crop=&bg=)<br> [Photo by Evan Kirby](http://unsplash.com/@evankirby2?utm_medium=referral&amp;utm_campaign=photographer-credit&amp;utm_content=creditBadge)

At Sift I work on a team responsible for the uptime of one of our core products: Workflows. Unfortunately the design of the Workflows engine is complex, difficult to debug, and unstable. GC death spirals are a common outage problem. Leaders would disappear and reappear for no reason, resulting in leadership discrepancies and forcing us to restart the entire cluster. Most of all, it was difficult to onboard new engineers. Few of us are comfortable debugging on-call issues; most will ping us whenever there's a fire.

Most of the problems stem from supporting an in-memory database and keeping state on three separate instances in sync. Moving the database to an external service simplifies a lot of things, but latency is the main concern. Some believe that under high load, the database would be unable to keep up with the product's needs. I asked a few of my mentors for their opinions; most sensed latency wouldn't be an issue, but I have to get hard numbers. I can't convince people with feelings and theory.

### Getting the Requirements

Luckily for me, Sift has metrics and standards in place for how fast a Workflows engine needs to be. All I have to do is send production-like requests to the system and make sure our metrics do not worsen. Sending a bunch of requests via curl and the up arrow isn't the best way to spend my afternoon, so I need a tool. But first I need to define what production data looks like.

1. **Uniqueness**: Requests must come in from 20+ different customers to avoid creating [hot spots on HBase](https://archive.cloudera.com/cdh5/cdh/5/hbase/book.html#_hotspotting). Our customer ids are part of the url, therefore the tool should be able to send requests to different urls.
2. **Longevity**: Traffic must be spread over a period of time to simulate days of load, not just seconds.
3. **Concurrent**: Requests must be fired concurrently to simulate multiple clients.
4. **Volume**: Controlling the number of transactions per second will allow me to get a better idea of when the new design hits a limit.
5. **JSON**: Requests must include a JSON payload, pretty typical for any service.

### Apache Benchmark (ab)

Apache Benchmark is really good at getting top performance measurements. From my understanding it will create a configurable number of connections and try to send a configurable number of requests through those connections, sending one request after another. That's great for testing your application for max performance and max load. It will then spit out a nifty output file which [R can plot](https://www.r-project.org/about.html). This allows you to quickly digest your results and share them with your team. Here's how it met my criteria:

1. **Uniqueness [NOPE]**: Unfortunately ab out of the gate won't fire requests to multiple endpoints.
2. **Longevity [SORTA]**: The `-t` option will stop the run once the test time passes a limit. You could theoretically say send a trillion requests within 10 minutes, and if your application doesn't handle the trillion requests, the benchmark run will terminate.
3. **Concurrent [YEP]**: `-c` will take a number, allowing you to configure the number of concurrent requests.
4. **Volume [SORTA]**: There is a `-n` option which will allow you to specify the total number of requests the test should fire; there is no option, however, to throttle the requests being fired, e.g. wait 200ms before firing the next request. You can achieve different levels of throughput by reducing and increasing the concurrency level, but there's no way to throttle requests.
5. **JSON [YEP]**: You'll have to configure the appropriate headers, but you are able to specify a text file to be the body of the request, allowing you to send any sort of data.

### Siege

Siege is a really nifty tool that aims to mimic user behavior. There are a ton of options, and it will even download resources on an HTML page, shedding light on how your servers will behave under load. Unfortunately the output isn't as helpful and clear as ab's. Nor is there any support for graphing with R. I suppose you can grab the entire output of the tests and feed it to something (**HACK PROJECT IDEA**?!?). How does it stand up to the criteria?

1. **Uniqueness [YEP]**: You can give `siege` a text file with a list of urls and it will randomly select one when it fires a request. Unfortunately there is no way to programmatically create the urls on the fly, forcing you to pre-list all possible urls.
2. **Longevity [YEP]**: Unlike ab, siege will fire requests until the configured time expires.
3. **Concurrent [YEP]**: `-c` allows you to configure the number of open concurrent connections.
4. **Volume [SORTA]**: The `-d` option will pause for a number of seconds between 1 and the number passed. This allows you to control the throughput of the system. Given 30 open connections, your total transactions per second (TPS) will be around 30, so you can avoid overwhelming the service. I'm not entirely sure if it can handle fractions of a second though.
5. **JSON [YEP]**: In addition to giving siege a file to use as the body, you can even give it a list of urls with different files, allowing you to mimic different request bodies. Neat right?
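To make that concrete, here's roughly what a siege setup for my criteria would look like. This is a sketch, not the exact files I used: the hostnames, customer ids, and payload paths below are made up for illustration.

```
# urls.txt -- one request per line; "POST <file" makes that file the request body
https://workflows.example.com/v1/customers/customer_1/events POST </tmp/payload_1.json
https://workflows.example.com/v1/customers/customer_2/events POST </tmp/payload_2.json
https://workflows.example.com/v1/customers/customer_3/events POST </tmp/payload_3.json
# ...one line per customer, 20+ in total
```

```
# 30 open connections, up to 5 seconds of random delay between hits,
# run for 10 minutes, picking urls from the file at random (-i)
siege -c 30 -d 5 -t 10M -i -f urls.txt -H "Content-Type: application/json"
```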
### Conclusion

Siege fits the bill for my test case. It does take a little more time to set up because it gives you so many knobs, but with all of those options siege lets you get pretty close to production-like traffic. In the end I was able to use siege to vet a new design and convince my team it would work in the wild. The design is live in production and has yet to fail us, knock on wood.

### Word of Caution

If you're using these tools to benchmark user latency (requests coming from a browser), keep in mind where you're running the tools from. For example, if your application is hosted in Virginia, latency will be terrible if you run tests on a box in Australia. Unless most of your users are in Australia, in which case, why are you hosting your application in Virginia? You can find more pitfalls and sharp edges to avoid in this great [Sonassi blog post](https://www.sonassi.com/blog/magento-kb/why-siege-isnt-an-accurate-test-tool-for-magento-performance).

## References

- Siege Manual<br> [https://www.joedog.org/siege-manual/](https://www.joedog.org/siege-manual/)
- Apache Benchmark<br> [https://httpd.apache.org/docs/2.4/programs/ab.html](https://httpd.apache.org/docs/2.4/programs/ab.html)
- Sonassi Blog Post<br> [https://www.sonassi.com/blog/magento-kb/why-siege-isnt-an-accurate-test-tool-for-magento-performance](https://www.sonassi.com/blog/magento-kb/why-siege-isnt-an-accurate-test-tool-for-magento-performance)

Tuning Postgresql Queries


postgresql database database index benchmark sift science

![Beyond the Cosmos](https://images.unsplash.com/photo-1488229297570-58520851e868?dpr=2&auto=format&fit=crop&w=400&q=80&cs=tinysrgb) <br>[Photo by Joshua Sortino](https://unsplash.com/@sortino)

### TLDR: Lessons learned

- Adding an index won't always speed up a query.
- PostgreSQL's query planner behaves differently depending on the number of rows in the database.
- Sorting is more expensive than filtering.
- Column order matters when creating an index.
- Partial indexes can improve performance but cover fewer use cases.

### Always Be Benchmarking!

There are no hard and fast rules when it comes to optimizing SQL queries. At Sift we recently migrated one of our systems' databases to PostgreSQL. PostgreSQL acts like a queue, holding and sorting tasks to be worked on. Worker threads pull tasks off the queue and update the results in the database. The queries to claim tasks from PostgreSQL are pretty straightforward:

```psql
SELECT * FROM flow
WHERE timeout < 1497744525100
  AND (state = 0 OR state = 1)
ORDER BY timeout ASC
LIMIT 50;

--- And

SELECT * FROM flow
WHERE timeout > 1497744525100
  AND state = 0
ORDER BY last_updated ASC
LIMIT 50;
```

During benchmark tests the new design could not keep up with a throughput of 100 tasks per second. Things started to back up in a really bad way. After throwing up metrics around the work iteration, I found the slowest part of the iteration wasn't doing the actual work, it was getting tasks to work on!

### Get the Numbers!

I ruled out latency right out of the gate because these were the only two queries taking > 100ms to run. It also wasn't due to network bandwidth, because we were pulling out 50 rows at most and the tables were rather skinny. As a sanity check I reduced the `LIMIT` to 10 to see if things improved; it didn't (it actually got worse). Best way to inspect queries in PostgreSQL? [EXPLAIN ANALYZE](https://www.postgresql.org/docs/current/static/using-explain.html#USING-EXPLAIN-ANALYZE).

```psql
EXPLAIN ANALYZE SELECT * FROM flow
WHERE timeout < 1497744525100
  AND (state = 0 OR state = 1)
ORDER BY timeout ASC
LIMIT 50;

--                                                                    QUERY PLAN
-- ------------------------------------------------------------------------------------------------------------------------------------------------
--  Limit  (cost=0.43..12.78 rows=50 width=88) (actual time=1790.418..1790.710 rows=50 loops=1)
--    ->  Index Scan using flow_timeout_idx on flow  (cost=0.43..344816.13 rows=1396378 width=88) (actual time=1790.417..1790.702 rows=50 loops=1)
--          Index Cond: (timeout < '1497744525100'::bigint)
--          Filter: ((state = 0) OR (state = 1))
--          Rows Removed by Filter: 3028429
--  Planning time: 0.305 ms
--  Execution time: 1790.764 ms
```

Fantastic! Okay, so this query takes about 1.7 seconds to execute, ugh! The query plan translates to:

- Using the timeout index, select anything with timeout < 1497744525100
- Select rows with state = 0 or state = 1 (removing 3,028,429 rows)
- Limit to 50 rows

Hrmm, we have an index on `state`...why isn't it using that? Oh, probably because it's a separate index and queries are limited to one index? Cool, let's add a multi-column index to help it out!
```psql
CREATE INDEX ON flow(timeout, state);

EXPLAIN ANALYZE SELECT * FROM flow
WHERE timeout < 1497744525100
  AND (state = 0 OR state = 1)
ORDER BY timeout ASC
LIMIT 50;

--                                                                    QUERY PLAN
-- ------------------------------------------------------------------------------------------------------------------------------------------------
--  Limit  (cost=0.43..12.78 rows=50 width=88) (actual time=1640.746..1641.038 rows=50 loops=1)
--    ->  Index Scan using flow_timeout_idx on flow  (cost=0.43..344816.13 rows=1396378 width=88) (actual time=1640.745..1641.027 rows=50 loops=1)
--          Index Cond: (timeout < '1497744525100'::bigint)
--          Filter: ((state = 0) OR (state = 1))
--          Rows Removed by Filter: 3028429
--  Planning time: 0.379 ms
--  Execution time: 1641.075 ms
-- (7 rows)
```

Nope! The query planner ignored the new index, what the deuce?

### Get specific

Thinking about what the query plan is trying to tell me...I think it's using the index to select the rows with timeout < 1497744525100 and it's _not_ using an index to select rows with state 0 or 1, so that means **it has to compare each row's state to the condition**. That's 3,028,429 + 1,396,378 = 4,424,807 rows!! That's a ton of rows!

Okay okay, we have to compare fewer rows, which means leveraging the timeout index to filter out even _more_ rows. Playing around with the data more, I noticed the majority of the rows in the table had a negative timeout. Soo.....

```psql
EXPLAIN ANALYZE SELECT * FROM flow
WHERE timeout < 1497744525100
  AND timeout > 0
  AND (state = 0 OR state = 1)
ORDER BY timeout ASC
LIMIT 50;

--                                                              QUERY PLAN
-- ---------------------------------------------------------------------------------------------------------------------------------------
--  Limit  (cost=0.43..225.64 rows=50 width=88) (actual time=0.030..0.137 rows=50 loops=1)
--    ->  Index Scan using flow_timeout_idx on flow  (cost=0.43..67049.45 rows=14886 width=88) (actual time=0.029..0.132 rows=50 loops=1)
--          Index Cond: ((timeout < '1497744525100'::bigint) AND (timeout > 0))
--          Filter: ((state = 0) OR (state = 1))
--  Planning time: 0.381 ms
--  Execution time: 0.179 ms
-- (6 rows)
```

Sweet! Adding an index here didn't help us because the query planner still had to filter through millions of rows. Next!

### Index column order matters

```psql
EXPLAIN ANALYZE SELECT * FROM flow
WHERE timeout > 1497744525100
  AND state = 0
ORDER BY last_updated ASC
LIMIT 50;

--                                                                     QUERY PLAN
-- ---------------------------------------------------------------------------------------------------------------------------------------------------
--  Limit  (cost=116966.60..116966.72 rows=50 width=88) (actual time=1206.017..1206.034 rows=50 loops=1)
--    ->  Sort  (cost=116966.60..118528.06 rows=624586 width=88) (actual time=1206.015..1206.025 rows=50 loops=1)
--          Sort Key: last_updated
--          Sort Method: top-N heapsort  Memory: 32kB
--          ->  Index Scan using flow_state_idx on flow  (cost=0.43..96218.30 rows=624586 width=88) (actual time=0.140..831.204 rows=1239683 loops=1)
--                Index Cond: (state = 0)
--                Filter: (timeout > '1497744525100'::bigint)
--                Rows Removed by Filter: 501
--  Planning time: 0.328 ms
--  Execution time: 1206.083 ms
-- (10 rows)
```

Here's the breakdown of what's going on:

- Using an index on state, get all the rows where state is 0.
- Compare and select rows where timeout is greater than 1497744525100 (501 rows were removed).
- Sort 624,586 rows by last_updated.
- Limit to 50.

We can actually see what most of the time is being spent on by looking at `actual time`.
Sorting the 624,586 rows in memory takes up the most time! Gross. Let's add an index!

```psql
CREATE INDEX on flow(timeout, state, last_updated);

EXPLAIN ANALYZE SELECT * FROM flow
WHERE timeout > 1497744525100
  AND state = 0
ORDER BY last_updated ASC
LIMIT 50;

--                                                                     QUERY PLAN
-- ---------------------------------------------------------------------------------------------------------------------------------------------------
--  Limit  (cost=116966.60..116966.72 rows=50 width=88) (actual time=1179.820..1179.835 rows=50 loops=1)
--    ->  Sort  (cost=116966.60..118528.06 rows=624586 width=88) (actual time=1179.819..1179.829 rows=50 loops=1)
--          Sort Key: last_updated
--          Sort Method: top-N heapsort  Memory: 32kB
--          ->  Index Scan using flow_state_idx on flow  (cost=0.43..96218.30 rows=624586 width=88) (actual time=0.077..810.652 rows=1239683 loops=1)
--                Index Cond: (state = 0)
--                Filter: (timeout > '1497744525100'::bigint)
--                Rows Removed by Filter: 501
--  Planning time: 0.451 ms
--  Execution time: 1179.879 ms
-- (10 rows)
```

Uhh wait what?! It's not using the index at all! Digging around the internet, it appears the order of the columns in an index matters. The most expensive operation this query has to do is the sort, so if it can sort first, then match the conditions, I bet things will look a lot better!

```psql
CREATE INDEX on flow(last_updated, timeout, state);

EXPLAIN ANALYZE SELECT * FROM flow
WHERE timeout > 1497744525100
  AND state = 0
ORDER BY last_updated ASC
LIMIT 50;

--                                                                          QUERY PLAN
-- ----------------------------------------------------------------------------------------------------------------------------------------------------------------
--  Limit  (cost=0.56..38.47 rows=50 width=88) (actual time=140.145..140.243 rows=50 loops=1)
--    ->  Index Scan using flow_last_updated_timeout_state_idx on flow  (cost=0.56..473538.62 rows=624586 width=88) (actual time=140.143..140.230 rows=50 loops=1)
--          Index Cond: ((timeout > '1497744525100'::bigint) AND (state = 0))
--  Planning time: 0.357 ms
--  Execution time: 140.296 ms
-- (5 rows)
```

Huzzah! Way faster, yet...not fast enough, we're aiming for < 50ms.

### Partial Index

Poking around the internet some more, I stumbled upon Heroku's blog post, "[Efficient Use of PostgreSQL](https://devcenter.heroku.com/articles/postgresql-indexes)".

> A partial index covers just a subset of a table’s data. It is an index with a WHERE clause. The idea is to increase the efficiency of the index by reducing its size. A smaller index takes less storage, is easier to maintain, and **is faster to scan**.

For this particular query we can create an index that mirrors the `WHERE` clause, which in turn will make the index smaller and the scan blazing fast!

```psql
CREATE index scheduler_query_idx ON flow (last_updated, timeout, state)
WHERE timeout > 0 AND state = 0;

EXPLAIN ANALYZE SELECT * FROM flow
WHERE timeout > 1497744525100
  AND state = 0
ORDER BY last_updated ASC
LIMIT 50;

--                                                                         QUERY PLAN
-- -------------------------------------------------------------------------------------------------------------------------------------------------------------
--  Limit  (cost=0.43..24.68 rows=50 width=88) (actual time=0.077..0.189 rows=50 loops=1)
--    ->  Index Scan using flow_last_updated_timeout_state_idx1 on flow  (cost=0.43..302935.79 rows=624586 width=88) (actual time=0.075..0.180 rows=50 loops=1)
--          Index Cond: (timeout > '1497744525100'::bigint)
--  Planning time: 2.255 ms
--  Execution time: 0.227 ms
-- (5 rows)
```

Note the index predicate doesn't have to match the query literally: the planner can prove that `timeout > 1497744525100` implies `timeout > 0`, so the partial index still qualifies. Dang....that's what I'm talking about!
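Before moving on, it's worth dropping the experimental indexes the planner ignored, since every write still has to maintain them. Here's a quick sketch; the index names assume PostgreSQL's default naming for unnamed `CREATE INDEX` statements, so double-check with `\di` before dropping anything:

```psql
-- How often has each index on flow actually been scanned?
SELECT indexrelname, idx_scan
FROM pg_stat_user_indexes
WHERE relname = 'flow';

-- Drop the multi-column indexes the planner never picked
-- (names assumed from PostgreSQL's default naming; verify with \di)
DROP INDEX IF EXISTS flow_timeout_state_idx;
DROP INDEX IF EXISTS flow_timeout_state_last_updated_idx;
```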
### Knowing the Trade Offs

Having an understanding of what PostgreSQL is doing under the hood is key to understanding the trade offs you're making to accomplish your goal. Blindly throwing indexes at the problem doesn't always work and will sometimes bite you in the end. The partial index we created above will likely only benefit one query. Other parts of the system will have to pay a cost in order to maintain the index. But it's worth the price because this query powers the heart of our work engine and is mission critical.

### References

- PostgreSQL Performance Tips <br>[https://www.postgresql.org/docs/9.6/static/performance-tips.html](https://www.postgresql.org/docs/9.6/static/performance-tips.html)
- Heroku: Efficient Use of PostgreSQL <br> [https://devcenter.heroku.com/articles/postgresql-indexes](https://devcenter.heroku.com/articles/postgresql-indexes)
- Understanding Execution Plans <br> [http://www.vertabelo.com/blog/technical-articles/understanding-execution-plans-in-postgresql](http://www.vertabelo.com/blog/technical-articles/understanding-execution-plans-in-postgresql)
- Create a similar database to follow along using <br> [https://github.com/KaoruDev/learning_psql/blob/master/create-test-flow.sql](https://github.com/KaoruDev/learning_psql/blob/master/create-test-flow.sql)

Dummy Data with PSQL


postgresql database psql

I'm gonna be honest, almost 2 weeks have gone by and I haven't finished my blog post on Postgres indexes. Mainly because I want to reproduce the behavior I saw. Alas, I could not, because Postgres' query planner behaves differently based on your query **and the data in your database**. Because interesting things happen when your database is filled with rows, I had to fill 'er up! That in and of itself was an adventure and a lesson.

### How not to fill up your database with dummy data

The closest tool to my hand was Rails. Yeah, it's not ideal. It's single threaded and is geared more towards usability than speed. Whatever, I didn't have to get off my fat butt and dig in the shelf (internet) for another tool. Off I went to create 5 million rows using ActiveRecord. I wrote a seed file which looped 5 million times, generating random data for each row. It took 3+ hours. I actually walked away, did errands, and came back to find my computer sleeping, that lazy...I had to change the settings on my computer and re-run the seed file. This was _after_ I had remembered that migration files run in transaction blocks. **UGH** That's right! I forgot I initially tried this with a migration! When my computer went to sleep, the transaction threw an error, making it as if nothing happened. Yeah, it was so horrible I forgot I did that until now....

### PSQL to the rescue!

While reading up on indexes I remember reading a blog post which had a snippet on how to insert rows into Postgres to then run `EXPLAIN` against. I ignored it at the time because that wasn't what I was looking for, I'm such a tool. Welp, I can't find it now, cause that's just my luck, so I had to go dig through the internet to learn this one Stack Overflow question / answer at a time..._le sigh_

Here's what my test table looks like:

```sql
                Table "public.flow"
    Column    |       Type        | Modifiers
--------------+-------------------+-----------
 run_id       | character varying | not null
 event_uuid   | character varying | not null
 state        | smallint          | not null
 last_updated | bigint            | not null
 timeout      | bigint            | not null
```

I'll need to create random text for `run_id` and `event_uuid`, a random integer for `state`, and epoch times for `last_updated` and `timeout`. Remember, I'm trying to create data as close to production as possible. Times need to be relatively close to today: `timeout` should be within the next 2 weeks, `last_updated` within the last week.

Okay, here's how I did it so you don't have to scroll all the way to the bottom:

```sql
INSERT INTO flow
SELECT
  run_id,
  md5(random()::text) AS event_uuid,
  round((random() * 5)) AS state,
  round(extract('epoch' FROM now() - random() * (now() - timestamp '07-10-2017')) * 1000),
  round(extract('epoch' FROM now() - random() * (timestamp '07-24-2017' - now())) * 1000)
FROM generate_series(1,5000000) AS run_id;
```

Still interested eh? Good, cause now I'll explain what's going on!

#### Create fake text / ids

When you need to create random strings, use `md5` + `random()`. `random()` will give you a double from 0.0 - 1.0. `::text` will cast the double into text so that you can hash it, giving you a random string!

```sql
SELECT md5(random()::text);

               md5
----------------------------------
 31687e13ac4e58491a8c745bc1c9c188
(1 row)
```
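#### Random integers

Oh, and `state`? Same trick: `random()` scaled up and rounded into a small integer. This is the exact expression from the big INSERT above.

```sql
-- random() * 5 gives a double between 0 and 5; round() snaps it to a whole number
SELECT round(random() * 5) AS state;
```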
#### Extracting epoch

`timestamp` is great because it'll generate a timestamp for you without a lot of work. Sucks if you're using epoch as time though! Not really, cause that's what `extract` is for.

```sql
SELECT EXTRACT('epoch' FROM now());

    date_part
------------------
 1497725140.25024
(1 row)
```

#### Generate lots of data

The secret sauce is definitely in `generate_series`. This will create a series of values from the first argument to the second.

```sql
SELECT generate_series(1, 5);

 generate_series
-----------------
               1
               2
               3
               4
               5
(5 rows)
```

### How fast is it?

Running this script (gen-data.sql):

```sql
SELECT now();

INSERT INTO flow
SELECT
  run_id,
  md5(random()::text) AS event_uuid,
  round((random() * 5)) AS state,
  round(extract('epoch' FROM now() - random() * (now() - timestamp '07-10-2017')) * 1000),
  round(extract('epoch' FROM now() - random() * (timestamp '07-24-2017' - now())) * 1000)
FROM generate_series(1,5000000) AS run_id;

SELECT now();
```

takes about 1.5 minutes on my 2013 MacBook Pro.

```sql
flow_runner_test=# \i gen-data.sql
              now
-------------------------------
 2017-06-17 12:35:49.284048-07
(1 row)

INSERT 0 5000000
              now
-------------------------------
 2017-06-17 12:37:03.664021-07
(1 row)
```

**AMAZING**

![](https://media.giphy.com/media/3FXZxuQZaepRS/giphy.gif)

## References

- Math functions in Postgres: [https://www.postgresql.org/docs/9.6/static/functions-math.html](https://www.postgresql.org/docs/9.6/static/functions-math.html)
- Create random strings: [https://stackoverflow.com/a/4566583](https://stackoverflow.com/a/4566583)
- Generate series: [https://www.compose.com/articles/postgresql-series-random-with/](https://www.compose.com/articles/postgresql-series-random-with/)
- Inserting random characters: [http://www.postgresql-archive.org/How-to-insert-random-character-data-into-tables-for-testing-purpose-THanks-td5680973.html](http://www.postgresql-archive.org/How-to-insert-random-character-data-into-tables-for-testing-purpose-THanks-td5680973.html)

Back at it


sift science growth communication java

![Muhammad Masood](https://images.unsplash.com/photo-1492005844208-658251ca63de?dpr=2&auto=format&fit=crop&w=300&q=80&cs=tinysrgb&crop=&bg=)<br> [Photo by Muhammad Masood](https://unsplash.com/@muhammadbinmasood)

It's been about a year since I've published anything. I have a few drafts on ice and they're all so sad. Sift Science has kept me busy learning and growing. I'm learning so much but keep failing to write about it, which is a problem. Writing helps me reinforce what I've learned and gives me a great reference point. There are countless times I've used my [regex post](https://www.kaoruk.com/posts/02-25-2016-regexing) to do back and forward matches. I can only imagine what I'll forget in the future...probably a lot.

So I'm going to promise, right here, the future posts I'm planning to write:

- Discovering race conditions at Sift
- Learning from scripting
- Bash pipe commands
- Using PostgreSQL as a queue
- Index learnings

That's 5 posts! I'm seriously not getting these done immediately, but I promise to publish one a week, or at the very least one every 2 weeks.

We use so many tools at Sift, it's really exciting to learn about all the different systems and technologies and actually use them on a daily basis:

- HBase
- Kafka
- ZooKeeper
- ElasticSearch

Oh, and I write in Java now, never thought that would be a thing. I took a [Udemy](https://www.udemy.com/java-the-complete-java-developer-course/) course at the beginning of 2017 and haven't looked back. I surprisingly enjoy writing in Java. I especially love having mature tools to leverage concurrency. IntelliJ has been key to learning the language and makes it way easier to write code.

Sift has exposed an area of weakness of mine: communication. My message sometimes fails to be clear enough for everyone to understand. This was pretty surprising because I've mentored a handful of students in the past and haven't had this problem. The difference is that when I mentor folks, the idea in my head is clear: I know the problem, I know the solution, I've tested it. At work I'm in the middle of understanding the problem or testing out a few solutions and nothing is really clear in my head. I've been starting my Slack conversations with `===== spit balling` to indicate that I'm rubber ducking and to prove that I'm making some progress on the problem.

I also want to add a commenting system to this site now that I got SSL going, woot! Stay tuned...

Taste of Awesome


culture binti

![Trees after a fire](https://images.unsplash.com/photo-1442473483905-95eb436675f1?crop=entropy&fit=crop&fm=jpg&ixjsv=2.1.0&ixlib=rb-0.3.5&q=80&w=900) [Photo by Dikaseva](http://dikaseva.com/?utm_medium=referral&utm_source=unsplash)

On Monday I re-entered the job market. Turns out international / domestic adoption is a lot more difficult than any of us thought. The founders, instead of pretending things would be alright and faking it, decided to go back into research mode. **Great call**, even if I got the short end of the stick. I don't regret joining the team. In my 1.5 months of being there I got to experience a great culture.

### Culture of Criticism

One of the things that attracted me to Binti was the _culture of criticism_. I would meet with my team members weekly or bi-weekly and discuss things that we could improve upon. There was a real desire to improve. It wasn't just about improving technical skills but also soft skills, the things that make you a better team member, a better person. Our industry, I believe, needs more of that.

### Impact

It was awesome having a bigger goal than myself. I felt like I was making a real difference in the world. I wasn't fixing a minor inconvenience for the privileged (although I admit I enjoy those benefits), I was helping [create families](https://binti.com/about/).

### Open

It felt really nice to be able to share my opinions and thoughts without repercussion. There were real discussions weighing pros and cons on certain issues. There were no top-down marching orders handed out without argument. There were orders, but they were well discussed, allowing all parties to share their input, even if we did not agree.

I'm still in awe of the aspirations of Felicia and Julia, taking on the beast that is adoption. I hope they do not fail.

I got a taste of what it's like to be part of a company that _seriously_ invests in the development of their employees. I want to continue to work for companies of the same caliber.