We know that monitoring performance is tedious and tiresome. But what happens if we keep writing code that poisons the repository, turns into legacy, and eventually causes software regressions? To avoid situations like this, we want the code itself to be readable, explainable, and performant by default, so that others can benefit from reading it and learning from the codebase.
By walking through a few real-world examples of how we optimize large-scale apps in practice, I hope to help programmers understand, step by step, how to think about their code using conventional modern software principles. It brings happiness to yourself, and to others as well.
A long time ago, when I joined my previous company PaGamO, the deployment process was still raw: a git clone plus some manual restart scripts written specifically for the project. So I decided to revamp it.
The figure above is the first draft of my redesigned CI/CD pipeline, built with the help of a few gems:
Of course, these are common gems in the Rails community. As a Senior Software Engineer, I always look for opportunities to automate as much as I can and to use tools that resolve problems ahead of time, improving the team's performance.
Besides the technical part of introducing tools and a revamped workflow, the key to reaching this kind of agreement is always communication between teams.
For example, QA needs to know which tasks must be tested at which point in time, and to understand changes in task assignments without too much communication overhead.
A good workflow speaks for itself: everyone should know what to do when an assignment changes and what each task status means for each member, so teams can get their jobs done easily. As a Scrum Lead, I should always keep track of progress, not by asking team members, but by using tools and the job board to ship and track products automatically.
Refactoring is hard, but I believe code that is hard to read is even more poisonous to the team. Code smells happen all the time, and how we refactor them not only improves readability but also provides a good example for incoming juniors' career development in the company.
module Api
  module Admin
    module Analytics
      class ProductsController < V1::ApiController
        def top_sold
          raise ParamsMissing, 'params missing' if params_missing
          raise InvalidDateInterval, 'range of days is not allowed' unless check_date_interval

          response = perform(
            merchant_id: current_merchant.id,
            url: analytics_params[:url],
            start_date: analytics_params[:start_date],
            end_date: analytics_params[:end_date]
          )

          render json: ::Analytics::FormatService.top_product_format(response['metrics']), status: :ok
        rescue RestClient::Exception => ex
          render json: JSON.parse(ex.response), status: :unprocessable_entity
        rescue => ex
          render json: { message: ex.message }, status: :unprocessable_entity
        end

        ...
      end
    end
  end
end
Even though the functions are separated, let's first refresh what the responsibility of a Rails controller is:
It coordinates the interaction between the user, the views, and the model. The controller is also a home to a number of important ancillary services. It is responsible for routing external requests to internal actions.
Simple and elegant. However, what we did here somewhat violates the first of the SOLID principles: Single Responsibility.
Look at the two raise statements at the top of the action: besides routing, the controller is also doing validation. If we want to customize or extend the validations, we might end up with all the validation logic living in private controller methods and guard clauses in front of the action definition.
The perform call that follows issues a request to a third-party service, and a RestClient or HTTParty request sitting in a controller looks out of place. If we want to support more methods, such as POST or form requests, it will definitely bloat the private functions defined in the controller.
The render and rescue block looks out of place too. We have a decorator/serializer here, but it strongly depends on a single attribute of the response, and other responses don't seem to fit this decorator. What if we want to change the format when the endpoint changes? This looks like a violation of the Open/Closed Principle to me.
Let's refresh the definition of the Open/Closed Principle:
software entities (classes, modules, functions, etc.) should be open for extension, but closed for modification
What I Did Here
Let's first extract the request handler into its own class:
require 'singleton'
require 'forwardable'

module Analytics
  class Request
    include Singleton

    # Assumed setup (not shown in the post): make def_delegator available on the singleton class.
    class << self
      extend Forwardable
    end

    # Delegate every newly defined instance method to the singleton instance,
    # so callers never have to write Analytics::Request.instance explicitly.
    def self.method_added(method)
      (class << self; self; end).def_delegator :instance, method
    end

    def get(key, params: {})
      # BASE_URL and URLS are defined elsewhere in the class (omitted in the post).
      RestClient::Request.execute(
        method: :get,
        url: "#{BASE_URL}/#{URLS[key]}",
        headers: { params: params }
      )
    end
  end
end
Looks good, right? Now the request has its own handler, and we can define methods like GET and POST of our own and extend it easily.
Delegating each newly added method through the method_added hook on the singleton class means we don't have to write .instance all the time; the class behaves like its single instance. This gives us better thread-safety management and lets us use some ActiveSupport methods that only work on instance methods.
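As a quick illustration (the endpoint key and params here are made up, not from the original project), the delegation makes both of these calls equivalent:

# Hypothetical usage: :top_sold and the params hash are illustrative only.
Analytics::Request.instance.get(:top_sold, params: { start_date: '2020-01-01' })

# Thanks to the method_added delegation, the shorter form works too:
Analytics::Request.get(:top_sold, params: { start_date: '2020-01-01' })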
module Analytics
  class Request
    module ErrorsHandler
      extend ActiveSupport::Concern
      include ActiveSupport::Rescuable

      included do
        rescue_from RequestError, with: :deny_access
      end

      class RequestError < StandardError
        ...
      end

      private

      def deny_access
        ...
      end
    end
  end
end
Now we have the customized errors and their rescuer defined in a single module, and we adopt a few ActiveSupport modules to make it even cleaner: rescue_from lets us map each rescued error to its own handler method.
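A minimal sketch of how this concern might be wired into the request handler (the placement of the rescue clause is my assumption; it isn't shown in the original):

module Analytics
  class Request
    include ErrorsHandler

    def get(key, params: {})
      # ... perform the request ...
    rescue StandardError => ex
      # rescue_with_handler comes from ActiveSupport::Rescuable; it invokes the
      # handler registered via rescue_from (here :deny_access for RequestError)
      # and re-raises anything it doesn't know how to handle.
      rescue_with_handler(ex) || raise
    end
  end
end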
What about ParamsMissing and InvalidDateInterval from the original controller action? They don't seem to belong here, and that's right! That logic sounds more like a model concern to me, and I think it deserves its own validator.
module Analytics
  class ParamsValidator
    include ActiveModel::Validations

    # Attributes, the initializer, and the validation declarations are omitted in the post.

    def validate!
      raise ValidationError.new(errors.full_messages) if invalid?

      true
    end

    class ValidationError < StandardError; end
  end
end
I find ActiveModel::Validations very useful when we want to validate something that isn't actually a Rails database model. We can still adopt the same simple and elegant validation DSL in a plain Ruby class.
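A quick usage sketch, assuming the validator receives the analytics params and declares presence validations on them (both of those parts are elided in the snippet above):

# Hypothetical usage; analytics_params and the declared validations are assumptions.
validator = Analytics::ParamsValidator.new(analytics_params)
validator.validate!
# => true when the params are valid; otherwise raises ValidationError with errors.full_messages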
Lastly, in the original action we had a format service running wild, not unified across responses. Let's turn it into a presenter that wraps the response object.
module Analytics
  module ResponsePresenters
    class Base
      def initialize(object)
        @object = JSON.parse(object)
        @raw = object
      end

      def as_json(*)
        ...
      end

      private

      ...
    end

    class OtherPresenter < Base
      ...
    end
  end
end
as_json allows the object passed to render json: to be called automatically and transformed into a JSON string. It's an example of the Liskov Substitution Principle, or duck typing, since we want the presenter to act like the response object returned by RestClient.
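As an illustration, a concrete presenter could look like this (the class name and the 'metrics' attribute are assumptions based on the original action, not code from the project):

module Analytics
  module ResponsePresenters
    # Hypothetical subclass: because it responds to #as_json, `render json: presenter`
    # serializes it just like a plain response body.
    class TopProducts < Base
      def as_json(*)
        @object['metrics']   # illustrative; the real formatting is not shown
      end
    end
  end
end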
Putting it together, the request handler now validates the params and injects the presenter:

module Analytics
  class Request
    # Singleton / Forwardable setup as before (omitted).

    def self.method_added(method)
      (class << self; self; end).def_delegator :instance, method
    end

    def get(key, params: {}, presenter: Analytics::ResponsePresenters::Base)
      # Added ParamsValidator
      ParamsValidator.new(params).validate!

      response = RestClient::Request.execute(
        method: :get,
        url: "#{BASE_URL}/#{URLS[key]}",
        headers: { params: params }
      )

      # Dependency injection: wrap the raw response in the chosen presenter
      presenter.new(response)
    end
  end
end
Since we don't need all the presenters at once, and probably only need to present/decorate the response for each particular request, the Open/Closed Principle fits well here: it gives us the idea of injecting the presenter class we want.
Finally, our resulting controller action becomes much slimmer.
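The original final snippet isn't reproduced in this post, so here is a rough sketch reconstructed from the pieces above (the :top_sold endpoint key and the TopProducts presenter are the hypothetical names used earlier):

module Api
  module Admin
    module Analytics
      class ProductsController < V1::ApiController
        def top_sold
          # Validation, the HTTP call, and presentation now all live behind the
          # request handler; the controller only routes and renders.
          render json: ::Analytics::Request.get(
            :top_sold,
            params: analytics_params,
            presenter: ::Analytics::ResponsePresenters::TopProducts
          ), status: :ok
        end
      end
    end
  end
end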
Kube Airflow expects Helm chart support for K8S and is less actively maintained, and I later found that https://github.com/helm/charts/issues/2591 had already started official support for Airflow. Therefore, since the Airflow chart is still at an early stage, I decided to use a simplified Docker Airflow image for deployment for now.
Even so, a simple Docker Airflow setup can cause you nightmares too, which is probably why so many people have forked their own versions. The Python dependencies are not pinned in the Dockerfile, so building the image installs the latest versions, which might crash the Airflow application. I suggest pinning specific versions to make sure every developer gets the same environment on their own machine. Of course, that's what Docker is meant for, isn't it?
Since we're deploying on Kubernetes, which already distributes microservices nicely across Pods and Containers, keeping the LocalExecutor setup for docker-compose won't reflect our real deployment very closely. Removing the LocalExecutor docker-compose file is fine; just make sure the CeleryExecutor works well for local development. I also suggest putting environment variables into files like .env for better key management instead of hardcoding them. Later on, we can .gitignore these files and deploy more freely with different environment settings per machine.
Since I recently got a new Mac, I realized that pow is no longer maintained and lacks several features like SSL support and WebSockets. I think it's time to switch to a better alternative.
Getting Puma-Dev on MacOS
Install Puma-Dev
$ brew install puma/puma/puma-dev
Setup
$ sudo puma-dev -setup
Link Puma-Dev Application
$ cd /your/app/path
$ puma-dev link
Start Puma-Dev Server
$ puma-dev -d localhost
#=> * Directory for apps: /Users/user/.puma-dev
#=> * Domains: localhost
#=> * DNS Server port: 9253
#=> * HTTP Server port: 9280
#=> * HTTPS Server port: 9283
As a developer, you use either iTerm2 or the built-in Terminal very often. Either of them is fine.
Install Homebrew
We'll start by installing Postgres through Homebrew, which is a package manager for macOS. Just follow the steps on the official website.
Brew Install Postgres
Enter brew install postgres to tell brew to install PostgreSQL for you. You'll see messages telling you how to start Postgres right away, but I suggest managing background processes in another way.
Install LaunchRocket
I believe LaunchRocket is great for managing brew-installed services: all you need to do is click Start and the service will be up. Because I use my personal laptop for many things, I don't want most services to run automatically at startup and eat up my memory, so this is a great tool to work with.
Create a Default DB Based on Your Username
Postgres uses your username as the default database to connect to. Type createdb $(whoami) in your terminal to make sure you have a database to enter and test with.
Try Out the psql Command
Try out psql in the terminal. If you enter the command-line interface to PostgreSQL, then you’re good to go!
And then to add the ORM behavior to a class, you just have to include Her::Model in it:
class User
  include Her::Model
end
After that, using Her is very similar to ActiveRecord:
User.all # GET "https://api.example.com/users" and return an array of User objects
User.find(1) # GET "https://api.example.com/users/1" and return a User object
@user = User.create(fullname: "Tobias Fünke")
# POST "https://api.example.com/users" with `fullname=Tobias+Fünke` and return the saved User object

@user = User.new(fullname: "Tobias Fünke")
@user.occupation = "actor"
@user.save
# POST "https://api.example.com/users" with `fullname=Tobias+Fünke&occupation=actor` and return the saved User object

@user = User.find(1)
@user.fullname = "Lindsay Fünke"
@user.save
# PUT "https://api.example.com/users/1" with `fullname=Lindsay+Fünke` and return the updated User object
Customization
For custom primary key:
class User
  # ...
  primary_key :user_id  # use user_id instead of id
end
For custom collection path:
class User
  # ...
  collection_path "user_all"  # "/user_all" instead of "/users"
end
For custom association:
class User
  # ...
  has_many :comments, data_key: "comment_items", path: "comment_items"
  # "data_key" represents the attribute in the user collection JSON for comments
  # "path" specifies the route for the resource, e.g. "/users/:id/comment_items"
end
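For instance, assuming the comments aren't embedded in the user JSON, accessing the association fetches them from that custom path:

@user = User.find(1)
@user.comments
# GET "https://api.example.com/users/1/comment_items" and return an array of Comment objects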
When migrating to S3, there comes a point where you've finished uploading your server images, but users keep uploading new ones afterwards.
This is an issue for us because we don't want to run Asset Sync all the time: it re-uploads every file, which can slow the server down.
A rake script can come to our rescue: we only need to check the files' creation timestamps on the server and upload just the files we need, reducing the overall syncing work compared to the Amazon command-line tool.
namespace :s3 do
  desc "Upload recent local images to S3 (a rake helper while migrating to S3; no use after the migration is completed)"
  task upload: :environment do
    local_path = "/home/apps/my-project/shared/public/system"
    s3_path = Rails.env.staging? ? "/my-project-staging/system" : "/my-project/system"

    # image_folder_ids, excerpt_image_folder_ids and headline_image_folder_ids hold the
    # folders updated within the last 5 days; the code that builds them is not shown in the post.

    if image_folder_ids.present?
      puts "Image IDs Updated 5 days ago"
      puts "-----------------------------"
      puts image_folder_ids
    end

    if excerpt_image_folder_ids.present?
      puts "Excerpt Image IDs Updated 5 days ago"
      puts "-----------------------------"
      puts excerpt_image_folder_ids
    end

    if headline_image_folder_ids.present?
      puts "Headline Image IDs Updated 5 days ago"
      puts "-----------------------------"
      puts headline_image_folder_ids
    end

    puts "\nUploading...\n"

    image_folder_ids.each do |folder_path|
      id = File.split(folder_path).last
      cmd = "s3cmd put -P #{folder_path} s3:/#{s3_path}/images/#{id}/ --recursive"
      puts cmd
      %x(#{cmd})
    end

    excerpt_image_folder_ids.each do |folder_path|
      id = File.split(folder_path).last
      cmd = "s3cmd put -P #{folder_path} s3:/#{s3_path}/excerpt_images/#{id}/ --recursive"
      puts cmd
      %x(#{cmd})
    end

    headline_image_folder_ids.each do |folder_path|
      id = File.split(folder_path).last
      cmd = "s3cmd put -P #{folder_path} s3:/#{s3_path}/headline_images/#{id}/ --recursive"
      puts cmd
      %x(#{cmd})
    end
  end
end
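The part that builds those folder-ID lists isn't shown above; here is a minimal sketch of how it might work, assuming the images live in per-ID folders under local_path (an assumption on my part):

# Hypothetical sketch: collect the asset folders modified within the last 5 days.
cutoff = 5.days.ago

image_folder_ids = Dir.glob("#{local_path}/images/*").select do |folder|
  File.mtime(folder) >= cutoff
end
# excerpt_image_folder_ids and headline_image_folder_ids would be built the same way
# from their respective folders.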
I recently discovered that counter cache is a great way to improve the performance of some SQL queries across related tables.
Since I show top links on my Simple Reddit web app, I ran into some issues scaling the top-links query in Rails. Later, I found that counter cache might be the solution.
Here’s what my implementation in pages_controller.rb looks like
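The original snippet isn't reproduced here, but a typical pre-counter-cache version of a top-links query looks something like this (the Link and Comment model names are assumptions, not the app's actual code):

# Hypothetical "before" version: sorting links by counting their comments
# triggers a COUNT (or loads the whole association) for every link on the page.
class PagesController < ApplicationController
  def home
    @top_links = Link.all.sort_by { |link| link.comments.size }.reverse.first(10)
  end
end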
The query at this stage is quite slow: before counter cache is implemented, the page always takes up to 200ms to render. So I followed along with the RailsCasts tutorial and modified some of my code.
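The modified code isn't shown in full, but the standard counter cache setup from that approach looks roughly like this (model and column names are still the hypothetical ones from above, and the migration uses Rails 4-era syntax to match the post):

# Hypothetical counter cache setup: Rails keeps links.comments_count up to date
# whenever a comment is created or destroyed.
class Comment < ActiveRecord::Base
  belongs_to :link, counter_cache: true
end

class AddCommentsCountToLinks < ActiveRecord::Migration
  def up
    add_column :links, :comments_count, :integer, default: 0, null: false

    # Backfill the counter for rows that already exist.
    Link.find_each { |link| Link.reset_counters(link.id, :comments) }
  end

  def down
    remove_column :links, :comments_count
  end
end

# The top-links query can now sort by the cached column instead of counting rows:
# Link.order(comments_count: :desc).limit(10)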
Seems easy enough! Now my page completes in under 30ms most of the time, and I no longer see lots of queries showing up in my log. I believe this will be a great way to solve problems like this in the future.