We know that monitoring performance is tedious and tiresome. But what happens if we keep writing code that poisons the repository, turns into legacy, and eventually causes software regressions? To avoid situations like this, we want the code itself to be readable, explainable, and performant by default, so that others can benefit from reading it and learning from the codebase.
By walking through a few real-world examples of how we optimize large-scale apps in practice, I hope to help programmers understand, step by step, how to think about their code using conventional modern software principles. It brings happiness to yourself, and to others as well.
A long time ago, when I joined my previous company PaGamO, the deployment process was still raw: a git clone plus some manual restart scripts written specifically for the project. So I decided to revamp it.
The figure above is the first draft of my redesigned CI/CD pipeline, built with the help of a few gems:
Of course, these are common gems in the Rails community. As a Senior Software Engineer, I always look for opportunities to automate as much as I can and to use tools that resolve problems ahead of time, improving the team's performance.
Besides the technical part of introducing tools and a revamped workflow, the key to reaching this kind of agreement is always communication between teams.
For example, QA needs to know which tasks must be tested at which point in time, and to understand changes in task assignments without too much communication overhead.
A good workflow speaks for itself: everyone should know what to do when an assignment changes and what each task status means for each member, so teams can get their jobs done easily. As a Scrum Lead, I should always keep track of progress, not by asking team members, but by using tools and the job board to ship and track products automatically.
Refactoring is hard, but I believe code that is hard to read is even more poisonous to the team. Code smells happen all the time, and how we refactor them not only improves readability but also provides a good example for incoming juniors' career development in the company.
module Api
  module Admin
    module Analytics
      class ProductsController < V1::ApiController
        def top_sold
          raise ParamsMissing, 'params missing' if params_missing
          raise InvalidDateInterval, 'range of days is not allowed' unless check_date_interval

          response = perform(
            merchant_id: current_merchant.id,
            url: analytics_params[:url],
            start_date: analytics_params[:start_date],
            end_date: analytics_params[:end_date]
          )

          render json: ::Analytics::FormatService.top_product_format(response['metrics']), status: :ok
        rescue RestClient::Exception => ex
          render json: JSON.parse(ex.response), status: :unprocessable_entity
        rescue => ex
          render json: { message: ex.message }, status: :unprocessable_entity
        end

        ...
      end
    end
  end
end
Even though the functions are separated, let's first refresh what the responsibility of a Rails controller is:
It coordinates the interaction between the user, the views, and the model. The controller is also a home to a number of important ancillary services. It is responsible for routing external requests to internal actions.
Simple and elegant. However, what we did here somewhat violates the first of the SOLID principles: Single Responsibility.
Look at the two raise statements at the top of the action: besides routing, the controller is also doing validation. If we want to customize or extend the validations, we might end up with all the validation logic living in private controller methods and guard clauses in front of the action definition.
The perform call that follows issues a request to a third-party service, and a RestClient or HTTParty request sitting in a controller looks out of place. If we want to support more methods, such as POST or form requests, it will definitely bloat the private functions defined in the controller.
The render and rescue block looks out of place too. We have a decorator/serializer here, but it strongly depends on a single attribute of the response, and other responses don't seem to fit this decorator. What if we want to change the format when the endpoint changes? This looks like a violation of the Open/Closed Principle to me.
Let's refresh the definition of the Open/Closed Principle:
software entities (classes, modules, functions, etc.) should be open for extension, but closed for modification
What I Did Here
Let's first extract the request handler into its own class:
require 'singleton'
require 'forwardable'

module Analytics
  class Request
    include Singleton

    # Assumed setup (not shown in the post): make def_delegator available on the singleton class.
    class << self
      extend Forwardable
    end

    # Delegate every newly defined instance method to the singleton instance,
    # so callers never have to write Analytics::Request.instance explicitly.
    def self.method_added(method)
      (class << self; self; end).def_delegator :instance, method
    end

    def get(key, params: {})
      # BASE_URL and URLS are defined elsewhere in the class (omitted in the post).
      RestClient::Request.execute(
        method: :get,
        url: "#{BASE_URL}/#{URLS[key]}",
        headers: { params: params }
      )
    end
  end
end
Looks good, right? Now the request has its own handler, and we can define methods like GET and POST of our own and extend it easily.
Delegating each newly added method through the method_added hook on the singleton class means we don't have to write .instance all the time; the class behaves like its single instance. This gives us better thread-safety management and lets us use some ActiveSupport methods that only work on instance methods.
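As a quick illustration (the endpoint key and params here are made up, not from the original project), the delegation makes both of these calls equivalent:

# Hypothetical usage: :top_sold and the params hash are illustrative only.
Analytics::Request.instance.get(:top_sold, params: { start_date: '2020-01-01' })

# Thanks to the method_added delegation, the shorter form works too:
Analytics::Request.get(:top_sold, params: { start_date: '2020-01-01' })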
module Analytics
  class Request
    module ErrorsHandler
      extend ActiveSupport::Concern
      include ActiveSupport::Rescuable

      included do
        rescue_from RequestError, with: :deny_access
      end

      class RequestError < StandardError
        ...
      end

      private

      def deny_access
        ...
      end
    end
  end
end
Now we have the customized errors and their rescuer defined in a single module, and we adopt a few ActiveSupport modules to make it even cleaner: rescue_from lets us map each rescued error to its own handler method.
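A minimal sketch of how this concern might be wired into the request handler (the placement of the rescue clause is my assumption; it isn't shown in the original):

module Analytics
  class Request
    include ErrorsHandler

    def get(key, params: {})
      # ... perform the request ...
    rescue StandardError => ex
      # rescue_with_handler comes from ActiveSupport::Rescuable; it invokes the
      # handler registered via rescue_from (here :deny_access for RequestError)
      # and re-raises anything it doesn't know how to handle.
      rescue_with_handler(ex) || raise
    end
  end
end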
What about ParamsMissing and InvalidDateInterval from the original controller action? They don't seem to belong here, and that's right! That logic sounds more like a model concern to me, and I think it deserves its own validator.
module Analytics
  class ParamsValidator
    include ActiveModel::Validations

    # Attributes, the initializer, and the validation declarations are omitted in the post.

    def validate!
      raise ValidationError.new(errors.full_messages) if invalid?

      true
    end

    class ValidationError < StandardError; end
  end
end
I find ActiveModel::Validations very useful when we want to validate something that isn't actually a Rails database model. We can still adopt the same simple and elegant validation DSL in a plain Ruby class.
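A quick usage sketch, assuming the validator receives the analytics params and declares presence validations on them (both of those parts are elided in the snippet above):

# Hypothetical usage; analytics_params and the declared validations are assumptions.
validator = Analytics::ParamsValidator.new(analytics_params)
validator.validate!
# => true when the params are valid; otherwise raises ValidationError with errors.full_messages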
Lastly, in the original action we had a format service running wild, not unified across responses. Let's turn it into a presenter that wraps the response object.
module Analytics
  module ResponsePresenters
    class Base
      def initialize(object)
        @object = JSON.parse(object)
        @raw = object
      end

      def as_json(*)
        ...
      end

      private

      ...
    end

    class OtherPresenter < Base
      ...
    end
  end
end
as_json allows the object passed to render json: to be called automatically and transformed into a JSON string. It's an example of the Liskov Substitution Principle, or duck typing, since we want the presenter to act like the response object returned by RestClient.
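As an illustration, a concrete presenter could look like this (the class name and the 'metrics' attribute are assumptions based on the original action, not code from the project):

module Analytics
  module ResponsePresenters
    # Hypothetical subclass: because it responds to #as_json, `render json: presenter`
    # serializes it just like a plain response body.
    class TopProducts < Base
      def as_json(*)
        @object['metrics']   # illustrative; the real formatting is not shown
      end
    end
  end
end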
Putting it together, the request handler now validates the params and injects the presenter:

module Analytics
  class Request
    # Singleton / Forwardable setup as before (omitted).

    def self.method_added(method)
      (class << self; self; end).def_delegator :instance, method
    end

    def get(key, params: {}, presenter: Analytics::ResponsePresenters::Base)
      # Added ParamsValidator
      ParamsValidator.new(params).validate!

      response = RestClient::Request.execute(
        method: :get,
        url: "#{BASE_URL}/#{URLS[key]}",
        headers: { params: params }
      )

      # Dependency injection: wrap the raw response in the chosen presenter
      presenter.new(response)
    end
  end
end
Since we don't need all the presenters at once, and probably only need to present/decorate the response for each particular request, the Open/Closed Principle fits well here: it gives us the idea of injecting the presenter class we want.
Finally, our resulting controller action becomes much slimmer.
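The original final snippet isn't reproduced in this post, so here is a rough sketch reconstructed from the pieces above (the :top_sold endpoint key and the TopProducts presenter are the hypothetical names used earlier):

module Api
  module Admin
    module Analytics
      class ProductsController < V1::ApiController
        def top_sold
          # Validation, the HTTP call, and presentation now all live behind the
          # request handler; the controller only routes and renders.
          render json: ::Analytics::Request.get(
            :top_sold,
            params: analytics_params,
            presenter: ::Analytics::ResponsePresenters::TopProducts
          ), status: :ok
        end
      end
    end
  end
end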
Kube Airflow expects Helm chart support for K8S and is less actively maintained, and I later found that https://github.com/helm/charts/issues/2591 had already started official support for Airflow. Therefore, since the Airflow chart is still at an early stage, I decided to use a simplified Docker Airflow image for deployment for now.
Even so, a simple Docker Airflow setup can cause you nightmares too, which is probably why so many people have forked their own versions. The Python dependencies are not pinned in the Dockerfile, so building the image installs the latest versions, which might crash the Airflow application. I suggest pinning specific versions to make sure every developer gets the same environment on their own machine. Of course, that's what Docker is meant for, isn't it?
Since we're deploying on Kubernetes, which already distributes microservices nicely across Pods and Containers, keeping the LocalExecutor setup for docker-compose won't reflect our real deployment very closely. Removing the LocalExecutor docker-compose file is fine; just make sure the CeleryExecutor works well for local development. I also suggest putting environment variables into files like .env for better key management instead of hardcoding them. Later on, we can .gitignore these files and deploy more freely with different environment settings per machine.
Since I recently got a new Mac, I realized that pow is no longer maintained and lacks several features like SSL support and WebSockets. I think it's time to switch to a better alternative.
Getting Puma-Dev on MacOS
Install Puma-Dev
$ brew install puma/puma/puma-dev
Setup
$ sudo puma-dev -setup
Link Puma-Dev Application
$ cd /your/app/path
$ puma-dev link
Start Puma-Dev Server
$ puma-dev -d localhost
#=> * Directory for apps: /Users/user/.puma-dev
#=> * Domains: localhost
#=> * DNS Server port: 9253
#=> * HTTP Server port: 9280
#=> * HTTPS Server port: 9283
As a developer, you use either iTerm2 or the built-in Terminal very often. Either of them is fine.
Install Homebrew
We'll start by installing Postgres through Homebrew, which is a package manager for macOS. Just follow the steps on the official website.
Brew Install Postgres
Enter brew install postgres to tell brew to install PostgreSQL for you. You'll see messages telling you how to start Postgres right away, but I suggest managing background processes in another way.
Install LaunchRocket
I believe LaunchRocket is great for managing brew-installed services: all you need to do is click Start and the service will be up. Because I use my personal laptop for many things, I don't want most services to run automatically at startup and eat up my memory, so this is a great tool to work with.
Create a Default DB Based on Your Username
Postgres uses your username as the default database to connect to. Type createdb $(whoami) in your terminal to make sure you have a database to enter and test with.
Try Out the psql Command
Try out psql in the terminal. If you enter the command-line interface to PostgreSQL, then you’re good to go!
And then to add the ORM behavior to a class, you just have to include Her::Model in it:
class User
  include Her::Model
end
After that, using Her is very similar to ActiveRecord:
User.all # GET "https://api.example.com/users" and return an array of User objects
User.find(1) # GET "https://api.example.com/users/1" and return a User object
@user = User.create(fullname: "Tobias Fünke")
# POST "https://api.example.com/users" with `fullname=Tobias+Fünke` and return the saved User object

@user = User.new(fullname: "Tobias Fünke")
@user.occupation = "actor"
@user.save
# POST "https://api.example.com/users" with `fullname=Tobias+Fünke&occupation=actor` and return the saved User object

@user = User.find(1)
@user.fullname = "Lindsay Fünke"
@user.save
# PUT "https://api.example.com/users/1" with `fullname=Lindsay+Fünke` and return the updated User object
Customization
For custom primary key:
class User
  # ...
  primary_key :user_id  # use user_id instead of id
end
For custom collection path:
class User
  # ...
  collection_path "user_all"  # "/user_all" instead of "/users"
end
For custom association:
class User
  # ...
  has_many :comments, data_key: "comment_items", path: "comment_items"
  # "data_key" represents the attribute in the user collection JSON for comments
  # "path" specifies the route for the resource, e.g. "/users/:id/comment_items"
end
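For instance, assuming the comments aren't embedded in the user JSON, accessing the association fetches them from that custom path:

@user = User.find(1)
@user.comments
# GET "https://api.example.com/users/1/comment_items" and return an array of Comment objects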
When migrating to S3, there comes a point where you've finished uploading your server images, but users keep uploading new ones afterwards.
This is an issue for us because we don't want to run Asset Sync all the time: it re-uploads every file, which can slow the server down.
A rake script can come to our rescue: we only need to check the files' creation timestamps on the server and upload just the files we need, reducing the overall syncing work compared to the Amazon command-line tool.
namespace :s3 do
  desc "Upload recent local images to S3 (a rake helper while migrating to S3; no use after the migration is completed)"
  task upload: :environment do
    local_path = "/home/apps/my-project/shared/public/system"
    s3_path = Rails.env.staging? ? "/my-project-staging/system" : "/my-project/system"

    # image_folder_ids, excerpt_image_folder_ids and headline_image_folder_ids hold the
    # folders updated within the last 5 days; the code that builds them is not shown in the post.

    if image_folder_ids.present?
      puts "Image IDs Updated 5 days ago"
      puts "-----------------------------"
      puts image_folder_ids
    end

    if excerpt_image_folder_ids.present?
      puts "Excerpt Image IDs Updated 5 days ago"
      puts "-----------------------------"
      puts excerpt_image_folder_ids
    end

    if headline_image_folder_ids.present?
      puts "Headline Image IDs Updated 5 days ago"
      puts "-----------------------------"
      puts headline_image_folder_ids
    end

    puts "\nUploading...\n"

    image_folder_ids.each do |folder_path|
      id = File.split(folder_path).last
      cmd = "s3cmd put -P #{folder_path} s3:/#{s3_path}/images/#{id}/ --recursive"
      puts cmd
      %x(#{cmd})
    end

    excerpt_image_folder_ids.each do |folder_path|
      id = File.split(folder_path).last
      cmd = "s3cmd put -P #{folder_path} s3:/#{s3_path}/excerpt_images/#{id}/ --recursive"
      puts cmd
      %x(#{cmd})
    end

    headline_image_folder_ids.each do |folder_path|
      id = File.split(folder_path).last
      cmd = "s3cmd put -P #{folder_path} s3:/#{s3_path}/headline_images/#{id}/ --recursive"
      puts cmd
      %x(#{cmd})
    end
  end
end
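The part that builds those folder-ID lists isn't shown above; here is a minimal sketch of how it might work, assuming the images live in per-ID folders under local_path (an assumption on my part):

# Hypothetical sketch: collect the asset folders modified within the last 5 days.
cutoff = 5.days.ago

image_folder_ids = Dir.glob("#{local_path}/images/*").select do |folder|
  File.mtime(folder) >= cutoff
end
# excerpt_image_folder_ids and headline_image_folder_ids would be built the same way
# from their respective folders.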
I recently discovered that counter cache is a great way to improve the performance of some SQL queries across related tables.
Since I show top links on my Simple Reddit web app, I ran into some issues scaling the top-links query in Rails. Later, I found that counter cache might be the solution.
Here’s what my implementation in pages_controller.rb looks like
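The original snippet isn't reproduced here, but a typical pre-counter-cache version of a top-links query looks something like this (the Link and Comment model names are assumptions, not the app's actual code):

# Hypothetical "before" version: sorting links by counting their comments
# triggers a COUNT (or loads the whole association) for every link on the page.
class PagesController < ApplicationController
  def home
    @top_links = Link.all.sort_by { |link| link.comments.size }.reverse.first(10)
  end
end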
The query at this stage is quite slow: before counter cache is implemented, the page always takes up to 200ms to render. So I followed along with the RailsCasts tutorial and modified some of my code.
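The modified code isn't shown in full, but the standard counter cache setup from that approach looks roughly like this (model and column names are still the hypothetical ones from above, and the migration uses Rails 4-era syntax to match the post):

# Hypothetical counter cache setup: Rails keeps links.comments_count up to date
# whenever a comment is created or destroyed.
class Comment < ActiveRecord::Base
  belongs_to :link, counter_cache: true
end

class AddCommentsCountToLinks < ActiveRecord::Migration
  def up
    add_column :links, :comments_count, :integer, default: 0, null: false

    # Backfill the counter for rows that already exist.
    Link.find_each { |link| Link.reset_counters(link.id, :comments) }
  end

  def down
    remove_column :links, :comments_count
  end
end

# The top-links query can now sort by the cached column instead of counting rows:
# Link.order(comments_count: :desc).limit(10)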
Seems easy enough! Now my page completes in under 30ms most of the time, and I no longer see lots of queries showing up in my log. I believe this will be a great way to solve problems like this in the future.