Monday, 13 November 2017

Two failed modifications of Paxos: towards a better understanding of the protocol

Two failed attempts at modifying Paxos.

The Paxos algorithm solves the following problem:
There are n processes.  Each process starts with a random value of 0 or 1.
Processed communicate via messages.  Messages might be lost or arbitrarily delayed.
At some time in the future, all processes need to decide on one of the two value.  There must be a way to find out when the decision has been made and what value it is.  Furthermore, to prevent trivial solutions, the decision outcome must be a subset of the starting values.  (I.e., if all processes start with 0, they cannot ever decide on 0).  Once a decision is reached, it must never change.

The following algorithm solves it assuming that a majority of servers can communicate with each other.  If they cannot, it just means that a decision will not be reached (liveness), there won't be a case where a decision is reached but later changed (safety).

Each machine keeps a local counter counter and also is assigned a unique server_id offline before everything starts.
Each server tries to make proposals:
1. Send a "prepare-to-propose" message to all machines (or a majority) containing its ballot number which is the pair (counter, server_id).  Increase its counter.
2. When it receives reply from a majority, do the following:
Each reply should contain a (max_ballot_accepted, and accepted_value) but might be null.  If everyone in the majority is null, then send propose message containing the ballot (counter, server_id) with any value.  However, if not everyone in the majority replied null, then sort by max_ballot_accepted, and propose that particular accepted_value.

Each server should also listen for prepare-to-propose message and proposals.  It should also keep track of (max_ballot_accepted), max_ballot_promised, and accepted_value.
1. Upon receiving prepare-to-propose, if the ballot is smaller than max_ballot_promised, ignore, if it is greater then, update max_ballot_promised and reply with (max_ballot_accepted, and accepted_value).
2. Upon receiving a proposal, check to see if it is greater or equal to max_ballot_promised, if so, accept the proposal and update our state.

When is consensus reached.  Consensus is reached when a majority of acceptors accepted the same ballot.  We can find out when that happens if all the processes for example, send out a status update whenever a proposal is accepted.  Of course, the status update might be lost and there is no way of knowing whether the message is lost.  But we are guaranteed that the consensus value will never change once it is changed (safety).

Some intuition on why it works:
1. The prepare-to-propose serves as a barrier to ensure that any accepted ballots will be "known" by later ballots.
2. By only proposing the largest ballot from a majority, we ensure that newer proposed values will remain the consensus value if consensus have been reached.

Once some majority M has accepted a ballot, then any proposer proposing higher ballot numbers must not have sent the "prepare-to-propose" message to this majority until after the acceptance.  Otherwise the prepare-to-propose message servers as an obstacle to getting this particular ballot accepted in at least one common member.
Lower ballot numbers also suffer the same issue.
Now, after receiving reply to prepare-to-propose messsages from a majority M', M' has at least one member in common with M.  We can do a strong induction on the ballot number.  If the current ballot is n, we can assume all proposed ballots between the original accepted ballot and n-1 (inclusive) all propose the same consensus value.  The common member between M' and M would have replied a ballot number at least as great as the original accepted ballot number.  The proposer must propose the value corresponding to the largest ballot number, hence it must also propose the consensus value.

To help us understand better, let's ask whether we can modify the protocol in two ways.
1. Instead of sending out a "prepare-to-propose" message which extracts a promise not to accept lower numbers, can we simply query for proposed values instead of asking for a promise.  We will block out lower ballots at the acceptance step.
The issue is that ballots with the same number might override the consensus value.
We could reach a state each row represents one process, their accepted ballot and value)
1, v1
1, v1
1, v1
2, v2
null
Consider the following sequence of messages (p1 stands for process one).
p1.PrepareToPropose(ballot 1).SendTo(1, 2, 3)
p2.PrepareToPropose(ballot 2).SendTo(1, 2, 3)
p1.Received.PrepareResponse(null, null, null).Propose(1, v1).SendTo(1, 2, 3)
(p1, p2, p3).Accept(ballot 1, v1)
p2.Received.PrepareResponse(null, null, null).Propose(2, v2).SendTo(4) // messages to other members of the majority are lost
p4.Accept(ballot 2, v2)

This is happening because p2 does not know that ballot 1 is already in progress.

2. Is it possible to declare that consensus is reached as soon as a majority has picked the same value?  Right now, we require majority to pick the same ballot.
No.  After majority picked the same value, the consensus value might not remain the same.
We again describe accepted ballot, accepted number, put also show the internal "promise" counter.
To simplify we only show global state transitions.  The user can fill in the messages and lost messages.

1: v1 (1)
1: v1 (1)
2: v2 (2)
2: v2 (2)
null (2) (proposal is lost, but prepare was received)

1: v1 (3) // proposal three only reaches the fifth process
1: v1 (3)
2: v2 (2)
2: v2 (2)
3: v1 (3) // here, we are wrong to think that consensus has been reached

Proposer now prepares and gets response from p1, p2, p3.  It will now propose v2.

4: v2 (4)
4: v2 (4)
4: v2 (4)
2: v2 (2)
3: v1 (3) // consensus value changed

Sunday, 18 September 2016

Reducible state space Markov chains

While reviewing a paper last night, I started thinking about reducible state space Markov chains.  Most undergraduate probability students are fed theorems about irreducible Markov chains.  To summarize these theorems.


  • If two states communicate, then they are either both recurrent or both transient.
  • If a MC is irreducible and all states are positive recurrent, then there exists a unique stationary distribution.
  • If a MC is irreducible, aperiodic, and has a stationary distribution, then it converges to that.

We can ask a set of questions about reducible Markov chains to test our intuition for the subject.
  • Let's setup a trivial 4 state Markov chain where the first two and last two states form separate communicating classes.
  • How many stationary distributions exist?
  • Do limiting distributions exist?
  • Does the limit distribution depend on the initial distribution?  How many possibilities are there?

Let's try to answer them.
  • There are an infinite number of stationary distributions.  If you take the stationary distribution of the first class and the stationary distribution of the second class, then any linear combination (equal to 1) of these two would be a stationary distribution would be a stat. distribution of the full 4-state Markov chain.
  • Limiting distributions do exist.  Let's think about the intuition here.  If the total prob. mass on the first two states is m_0 and the rest is 1-m_0, then starting from this initial distribution, each iteration of the chain would "stir" this vector but only stir within their respective classes.  So the first two states of this vector would converge to its limit distribution and same with the last two.  This means the overall limit distribution would be a mixture of the individual limit distributions weighted by the (m_0, 1-m_0) vector.

Wednesday, 19 December 2012

Course Catalog: Classical Composition

Description: This is a course on classical music composition that will teach the principles of melody, harmony, and orchestration.  However, this course does not teach you to form you musical ideas.  That comes from your talent.  If you don't have it, you can become an orchestrator.  The distinguished alumni of this course usually breaks with the principle we teach them and write great music nevertheless.  (Shostakovich, Debussy)

Course Credit: None

Pre-Requisites: No technical pre-requisite.  However, one should try to live a life of calamity, solitude, and mental instability through the achievement of at least one of the following:


  1. Unfulfilling / unrequited love for a string of women who are outside of your social class, for example members of the aristocracy or the royals.  See Beethoven.
  2. If you are homosexual, the unfortunate need to oppress your orientation due to social pressures of your society.  See Tchaikovsky.
  3. Have a high infant mortality rate.  See Bach.
  4. Losing of either vision or hearing.  (See Beethoven or Rodrigo)
  5. Live under a repressive Communist dictatorship (Shostakovich)
  6. Lastly, having your wife engaging in an affair with one of your best friend.  (Schumann)



Tuesday, 18 December 2012

AI-assisted political science

In the wake of the Sandy Hook shooting, there has been much debate on the radio about the gun policy in the United States.  The gun advocates stress that banning ownership of guns will not take it out the hands of criminals and decrease violent crime.  Both sides of the debate totes different statistics in order to support their own side.  During the presidential debates, there were numerous occasions where numbers were presented and its truthfulness contested by Obama and Romney.  One instance that comes to mind was when the factual accuracy of Romney's tax policy for the middle class was in dispute.

I think this is one area where information retrieval / data science /weak AI (doesn't mean crappy AI, but just to distinguish it from the well-defined Strong AI term) can really contribute to the public sphere.  Imagine a Google or Wolfram Alpha type program that parses the content of a debate and automatically checks for factual accuracy: gun crime rate, economic data, demographics, tax rates, tax codes etc, perhaps via an overlay on the screen.  This really would sway public discussions from "what are the facts" vs "what is the right policy".  It also prevents cheating and mis-representation of facts.

To push this idea further, imagine an AI-assisted data retrieval / analytics program for the common masses.  To take an example, I wanted to draw a simple correlation graph of gun ownership rates by country vs homicide rate by country.  Just to answer this one simple question, I had to manually enter / merge two separate tables found on Wikipedia, and then filter the data for missing entries using MATLAB, and perform a plot.  This set of computer skill, in the short future, is beyond the common knowledge of the average citizen, yet they need these types of questions answered concretely in order to vote on the right candidate / policy for this country.  How about a more advanced search  / analytics that would, given a question in natural language format, e.g. "Has the crime rate in NYC decreased after 1990?", it would automatically retrieve & clean the relevant data and perform the relevant statistical analysis.  (Regression, classification, etc.)  Currently Google / Bing can answer factual questions, but they are still one step away from answering analytical questions.

Sunday, 25 November 2012

Richard Dawkins on the evolutionary basis of morality

I was just watching a Richard Dawkins lecture at Randolph-Macon Women's College on YouTube.  He was asked a question about the basis of atheistic morality.  Dawkins proposed that our moral sense comes from the fact that our pre-historic ancestors lived mostly with next-of-kins hence 1) altruism will benefit the preservation of similar genes 2) there's a high probability of being in long-term contact with these individuals who can then reciprocate the good will.
I would propose that altruism serves to improve upon another evolutionary survival objective, the preservation of the species.  When pre-historic human beings lived in tribes, there was a high risk of destruction from other animals, diseases, and natural elements.  Hence at some point, competition to extinguish a rival member of the same species is out-weighed by the risk that the size of the species might fall below a critical threshold and threaten extinction.  For example, you don't want to kill Adam for stealing your girlfriend because he and you can work together when the tigers attack tomorrow night.
If we think about tribes and animal social organizations as entire units, then that unit would be evolving to maximize its survival as a whole.  If a society randomly generated a moral sense that does not include any altruism in an "attempt" to maximize individual benefits, that group as a whole (and consequently the individual) might have a lower probability of survival because the survival benefits that are derived from the group would be lost (i.e. strength in numbers).  In another words, when the survival of the individual is linked to a certain critical mass and critical health of his immediate social organization unit, this species would likely evolve some moral sense due to the pressure of natural selection even though this moral sense, on the surface, appears to be anti-beneficial to the individual upon a first-order examination.

Tuesday, 9 October 2012

Similarity between music performance and muay thai

I'm writing a somewhat satirical comparison between music performance (mainly violin) and muay thai.

  1. In violin, you practice boring etudes and scales in order to improve your fundamentals so you can better play your beautiful pieces.  In Muay Thai, you practice boring conditioning, mitt, and heavy bag drills in order to improve your athleticism and technique so you can better hold out in your fights.
  2. In violin, technique alone is not enough.  At a high enough level, there's a deep, almost unteachable, element of artistic creativity and musicianship that brings the techniques alive in a coherent work of musical art.  In Muay Thai, technique alone is not enough.  At the high level, you need a great sense of strategy, movement, and mental determination that is almost unteachable to bring all your techniques together in a work of martial art.
  3. In violin performance, you "practice-perform" your pieces by getting nervous and playing in a Masters class.  In Muay Thai, you "practice-fight" by sparring with partners.
  4. Before a violin performance, you become frigging nervous about messing up.  Before a Muay Thai bout, you become frigging nervous about losing and getting messed up.
  5. In a Muay Thai fight, if you mess up you might get embarrassed and/or knocked out.  In a music performance, if you mess up you might get embarrassed and/or knocked out.
  6. When you're too old to be a travelling musician, you become a teacher in a music school.  When you're too old to be fighting professionally, you become a coach in a gym.
  7. If you cannot make a living performing music, you can always teach the instrument.  If you cannot make a living fighting Muay Thai, you can always become a coach or personal trainer.

Saturday, 23 April 2011

Practical use for characteristic function & electrical engineering technique in evaluating sines

Yesterday I was thinking about the uniqueness of the heat equation in both the Cauchy setting and the Dirichlet boundary setting.  While thinking about this, the mathematics came down to computing the convolution of the heat kernel with the initial condition which is periodic with fundamental domain $[0,L]$.
After while, I needed to compute the convolution of the heat kernel with a sin function.  In probabilistic terms, I need to compute
$$ E[sin(X)] \; X \sim N(\mu, \sigma) $$
Brute forcing the expected value is very difficult. Then i realized that we can use the good old characteristic function if we represent $sin(X)$ as $\frac{e^{iX}-e^{-iX}}{2i}$
If we convolve a sine signal with some kernel, how do we know the output is still a sine signal?  Well if the kernel is symmetric (even), that means the Fourier transform will be even.  Since sine has a spectrum consisting of two Dirac delta functions at $+\omega$ and $-\omega$, then the output of the convolution will be another sine function of a different magnitude.