Factors Blog

Insights Across All Things B2B Account Intelligence & Analytics

Intuition can only take us so far: Fun with Factors (Part 2)

March 9, 2021

Continuing our series on “Fun with Factors” (you can find the first part here), we held another session on “Intuition can only take us so far”, in which we discussed how non-intuitive concepts such as irrational numbers are very much real. We also established the importance of grounding ideas in their bare-bones structure, lest we confuse ourselves and fall into paradoxes.

The Irrational Route

For a number to be rational is to be expressible as a fraction -- the well-known p-by-q (p/q). For completeness, recall that ‘p’ and ‘q’ must be integers, and ‘q’ must be non-zero.

That said, is it not easy to see that every number is rational? What’s the big deal? Wait, prepare to be challenged! You need to prove (or disprove) that the square root of 2 (i.e., √2) is a rational number. Oh, I heard you! You say √2 is an "imaginary" concept with no practical existence. Smart; you took the challenge to another level! So let’s first see what √2 looks like, and how it’s very real!

Take a square piece of cloth ABCD, each side of which measures 1 m. Now cut it into two pieces along one of its diagonals (say, AC). What you get are two right-angled triangles, ABC and ADC. Let’s take one of them -- ABC. How much do its sides measure? We know AB = 1 m and BC = 1 m; but AC = ?


Following Pythagoras’ advice, we could compute AC = √(AB² + BC²) = √(1+1) = √2. Bingo! We have a triangular cloth with one side measuring √2 metres. But you might object! “Why √2? I used a ruler and measured it to be 1.414 m.” Are we in a fix? Not yet. Analytically, we have AC = √2, but on measuring it with a ruler, we get 1.414. One might be tempted to conclude that the value of √2 is exactly 1.414. That would be a smart move, because if you could prove it, you would have √2 = 1.414 = 1414/1000 -- a rational number indeed! Let us see.

So what sorcery is this entity called √2? Simply speaking, it’s the number whose square is 2. So, we should expect the square of 1.414 to be 2. Alas! It turns out that 1.414² = 1.999396, a little short of 2, isn't it?

Never mind, you procure a better ruler with more precise scale markings and measure the diagonal side of the cloth (AC) to be 1.41421356237 m. But on squaring it, we get 1.41421356237² = 1.9999999999912458800169, again, short of 2.
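
To see this numerically, here is a small Python sketch (ours, not from the original session; purely illustrative) that squares ever more precise decimal approximations of √2 -- the square always falls a little short of 2:

```python
from decimal import Decimal, getcontext

# Square increasingly precise decimal approximations of sqrt(2).
# No matter how many digits we keep, the square falls just short of 2.
for digits in (3, 11, 30, 60):
    getcontext().prec = 2 * digits + 10                 # enough precision for exact squares
    approx = Decimal(2).sqrt().quantize(Decimal(1).scaleb(-digits))
    square = approx * approx
    print(f"({approx})^2 = {square}   short of 2 by {Decimal(2) - square}")
```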

The fact of the matter is that no matter how precisely you measure it, no finite decimal -- and, as we shall see, no fraction at all -- squares to exactly 2. But how do I convince you of that? You should demand a proof: a proof that √2 is not a rational number.

Let’s see what we could do:

Assume √2 to be a rational number, and let’s give this assumption a name: "The Rational Root Assumption" (TRRA). If TRRA were true, we should be able to find two integers p and q such that √2 = p/q. In addition, let us demand that p and q meet one condition: they have no common factors except 1. Let us call this the “no common factors” condition (NCFC).

Now, “√2 = p/q” simply means that p = q√2, or p² = 2q². Since multiplying an integer by 2 always yields an even number, 2q² is even, and hence p² is even as well. This leads to our first conclusion: p is an even number. (If p were odd, we could write p = 2k+1 for some integer k, and then p² = (2k+1)² = 4k²+4k+1 = 2(2k²+2k) + 1 would be odd too -- which is impossible, since we just showed that p² is even.) Let’s call it the “p is an even number” conclusion (PENC).

But what does PENC mean? That p can be written as 2m for some suitable integer m. Substituting this into p² = 2q², we get (2m)² = 2q², i.e., 4m² = 2q², i.e., q² = 2m². We have seen this pattern before: q² is even, and hence q is even (for the reasons made clear above). Let us call this the “q is an even number” conclusion (QENC).

The summary of the foregoing discussion is this: [TRRA and NCFC] implies [PENC and QENC]. In other words, if √2 is a rational number with numerator p and denominator q, and p and q have no common factors, then both p and q are even numbers. Isn't that hard to believe? How could p and q both be even and yet have no common factors? If they are both even, they have 2 as a common factor. This is what we call a contradiction! And since the logical flow was flawless, there is only one explanation for the contradiction: the TRRA itself -- the assumption that √2 is rational -- must be false. Hence, we have proved that √2 is irrational. Period!

Was this discussion easy to follow? Yes.

Was it easy to write? No, because we used full English sentences to express the proof.

In fact, proofs are best expressed using shorthand symbols. To illustrate, the following would be a shorter version of the same argument:

To prove: √2 ∉ ℚ.

Proof: Assume √2 ∈ ℚ.

⇒ ∃ p, q ∈ ℤ with p⊥q and q ≠ 0 s.t. √2 = p/q.

⇒ p² = 2q² ⇒ 2|p² ⇒ 2|p ------------------> (1)

⇒ ∃ m ∈ ℤ s.t. p = 2m ⇒ (2m)² = 2q² ⇒ q² = 2m² ⇒ 2|q² ⇒ 2|q ------> (2)

Now from (1) and (2) above, we have 2|p and 2|q.

⇒ p⊥q is not true. Hence, we have a contradiction.

So, √2 ∉ ℚ. Hence, proved.

So √2, after all, is an irrational number and hence cannot be written as a fraction of two integers.
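
For the computationally inclined, here is a small illustrative Python check (ours; the proof above is what settles the matter) that searches for integers p, q with p² = 2q² and, as expected, finds none:

```python
from math import isqrt

def find_rational_sqrt2(max_q: int):
    """Search for integers p, q (q <= max_q) with p^2 == 2*q^2."""
    for q in range(1, max_q + 1):
        p = isqrt(2 * q * q)            # best integer candidate for p
        if p * p == 2 * q * q:
            return p, q
    return None

print(find_rational_sqrt2(1_000_000))   # prints None, as the proof predicts
```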

Impossible Probabilities

To find the probability of an event is to measure something. And the prerequisite to making measurement possible is to define what to measure. Imagine what happens if what you want to measure is not well-defined. When asked to compute the conversion ratio of a campaign, your first question should be: what is the definition of a conversion event? Let us understand the importance of defining concepts explicitly and clearly with the following example -- one of Bertrand’s paradoxes -- taken from the book on Probability and Statistics by Vijay K. Rohatgi et al.

Question: A chord is drawn at random in the unit circle. What is the probability that the chord is longer than the side of the equilateral triangle inscribed in the circle? 

To understand the question more clearly, consider the circle as follows.

[Figure: a chord drawn at random in the unit circle]

We have a circle (in red) centered at C with radius r = 1. Inscribe into it an equilateral triangle PQR (blue). If we now randomly draw a chord on this circle (call it chord AB), what is the probability that it is longer than the side (say s = PQ = QR = RP) of the triangle PQR?

Do you see any problem in the question formulation? If not, you might be surprised to know that there are at least three solutions, depending on how one defines the concept of “a chord at random”.

Solution 1: Every chord on the circle could be uniquely defined by its end-points. Let us fix one of the end-points -- A -- on the circumference of the circle. This also defines a unique inscribed equilateral triangle APQ. The choice of the other end-point (B) dictates the length of the chord AB.

If B lies on the arc between A and P (Case 1 below), we get a chord shorter than the side of the triangle. Similar is the case when B is chosen on the circumference of the circle between A and Q (Case 2 below). But when we choose B to be somewhere on arc PQ (Case 3), we get a longer chord. 

[Figure: Solution 1 -- the three cases for a chord drawn at random in the unit circle]

Hence, the favourable positions for B (i.e., those for which AB is longer) are the points on the circumference between P and Q (Case 3). Since A, P, and Q divide the circumference of the circle into three equal arcs AP, PQ, and AQ, we have length(arc AP) = length(arc PQ) = length(arc AQ) = 2𝜋/3. Hence, the desired probability is length(arc PQ) / circumference = (2𝜋/3) / 2𝜋 = 1/3.

Solution 2: Another way in which the length of a random chord is uniquely determined is by the distance of the chord’s midpoint from the circle’s centre. Fix a radius OC, from a point O on the circumference to the centre C, and orient the inscribed equilateral triangle PQR so that one of its sides cuts OC at a point S. Then length(OS) = length(SC) = length(OC) / 2 = 0.5. Our problem can be solved by picking a point X on OC and drawing the chord AXB through X, perpendicular to OC.

[Figure: Solution 2 for a chord drawn at random in the unit circle]

Now, where X is picked decides how long the chord will be. If X is picked on segment OS (nearer the circumference), we get a chord shorter than the triangle’s side; picking it on segment SC (nearer the centre C) gives a longer one. So our favourable region for X is the segment SC. In other words, the desired probability is length(SC) / length(OC) = 0.5 / 1 = 1/2.
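
A quick simulation makes the disagreement concrete. The following Python sketch (ours; the function names are illustrative) samples chords under the two definitions above and estimates the probability in each case -- the estimates hover around 1/3 and 1/2 respectively:

```python
import math
import random

SIDE = math.sqrt(3)      # side length of an equilateral triangle inscribed in a unit circle
N = 200_000

def chord_by_endpoints():
    """Solution 1: pick two random end-points on the circumference."""
    t1, t2 = random.uniform(0, 2 * math.pi), random.uniform(0, 2 * math.pi)
    return 2 * abs(math.sin((t1 - t2) / 2))     # chord length between the two angles

def chord_by_midpoint_distance():
    """Solution 2: pick the chord's midpoint uniformly along a fixed radius."""
    d = random.uniform(0, 1)                    # distance of the midpoint from the centre
    return 2 * math.sqrt(1 - d * d)             # resulting chord length

for sample in (chord_by_endpoints, chord_by_midpoint_distance):
    hits = sum(sample() > SIDE for _ in range(N))
    print(sample.__name__, hits / N)
```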

In conclusion, the same question has two solutions -- 1/3 and 1/2 -- depending on our interpretation of the concept of a “random chord”. If you refer to the book, there is another interpretation that gives a probability of 1/4. This shows how important the exercise of “defining” a concept can be.
At Factors, we support the philosophy of crunching numbers (rather than intuition) to provide intelligent marketing insights, which are only a click away for you to experience: click here to schedule a demo with us. To read more such articles, visit our blog, follow us on LinkedIn, or read more about us.

Intuition can only take us so far: Fun with Factors (Part 1)

Analytics
January 25, 2021

“Trust your intuition; it never lies” -- a saying most of us have heard and might strongly agree with. But at Factors this week, things were quite different when we had a session on “Intuition can only take us so far”. The idea was to relook at known concepts -- concepts we use more often than not -- and reimagine their implications from different perspectives. This article is an account of the one-hour discussion. We associate the word “factors” with different concepts at different times. Here, we associate it with maths!

Mathematics: Sturdy yet fragile

We started with the following story from “How Mathematicians Think” by William Byers:

A mathematician is flying non-stop from Edmonton to Frankfurt with Air Transat. The scheduled flying time is nine hours. Sometime after taking off, the pilot announces that one engine had to be turned off due to mechanical failure: "Don't worry -- we're safe. The only noticeable effect this will have for us is that our total flying time will be ten hours instead of nine." A few hours into the flight, the pilot informs the passengers that another engine had to be turned off due to mechanical failure: "But don't worry -- we're still safe. Only our flying time will go up to twelve hours." Sometime later, a third engine fails and has to be turned off. But the pilot reassures the passengers: "Don't worry -- even with one engine, we're still perfectly safe. It just means that it will take sixteen hours total for this plane to arrive in Frankfurt." The mathematician remarks to his fellow passengers: "If the last engine breaks down, too, then we'll be in the air for twenty-four hours altogether!"

Well, from basic math knowledge, you might find the next number in the sequence 9, 10, 12, 16 to be 24. Here’s how you find it. The first four numbers could be broken down as follows:

9 = 9

10 = 9+2⁰

12 = 9+2⁰+2¹

16 = 9+2⁰+2¹+2²

Pretty clearly, the next number in the sequence has to be 9+2⁰+2¹+2²+2³ = 24.

But does that mean the plane will stay in the air for 24 hours? No. It has only four engines, and if the last one breaks down too, the pilots would either perform an emergency landing or, in the unfortunate case, the plane would crash. This shows both the strength and the fragility of maths: while in the first four cases we could accurately predict how long the journey would take, as soon as the conditions change (gliding through the air instead of being thrust by engines), the dynamics of motion change too.

Intuition could misdirect

Following is an example that the “professor of professors”, Prof. Vittal Rao, gave in one of his talks: Imagine you have some identical coins you are supposed to distribute among some identical people. How would you do that? Or, more mathematically: In how many different ways P(n) can you distribute n identical coins among any number of identical people? Let us understand the problem by taking cases:

n = 1

  • The only way to do that is to give it to a single person:  o.  Hence, P(1) = 1.

n = 2

Distribute 2 coins. Here are two different ways:

  • You either give both coins to one person:  oo
  • Or you take two people and hand them a coin each:  o|o

Hence, P(2) = 2.

n = 3

Distribute 3 coins. What do you think P(3) should be? If P(1) = 1, P(2) = 2, we could expect P(3) to be 3, right? Let’s see.

  • ooo
  • oo|o
  • o|o|o

And 3 it is! Hence, P(3) = 3.

n = 4

Now this drives our intuition even further. The sequence we have seen until now has been 1, 2, 3. So it’s natural to assume P(4) to be 4. Let us enumerate all cases again.

  • oooo
  • ooo|o
  • oo|oo
  • oo|o|o
  • o|o|o|o

We have 5 ways to distribute 4 coins -- this beats our intuition. We get P(4) = 5.

n = 5

With new information in hand (i.e., the sequence being 1, 2, 3, 5), we could update our intuition, say this matches the Fibonacci sequence, and expect it to continue as 1, 2, 3, 5, 8, 13, ... Let’s see what happens with 5 coins in hand:

  • ooooo
  • oooo|o
  • ooo|oo
  • ooo|o|o
  • oo|oo|o
  • oo|o|o|o
  • o|o|o|o|o

We get P(5) = 7 (not 8 as we had expected).

n = 6

Now what? We could turn to a different logic: the numbers are either the odd numbers (barring the extra ‘2’), following 1, 2, 3, 5, 7, 9, 11, …, or the prime numbers (barring the extra ‘1’), following 1, 2, 3, 5, 7, 11, 13, ..., giving P(6) to be either 9 or 11 respectively. Taking n = 6, we have:

  • oooooo
  • ooooo|o
  • oooo|oo
  • ooo|ooo
  • oooo|o|o
  • ooo|oo|o
  • oo|oo|oo
  • ooo|o|o|o
  • oo|o|o|oo
  • oo|o|o|o|o
  • o|o|o|o|o|o

That’s 11 ways! The prime-number logic worked.

n = 7

Going by the same logic, we would expect P(7) to be 13 (the next prime number). But if you go ahead and calculate it, you will find that P(7) is, in fact, equal to 15 (please go ahead and enumerate the cases).

In fact, it turns out that the sequence P(n) expands as follows: 1, 2, 3, 5, 7, 11, 15, 22, 30, 42, 56, 77, 101, 135, 176, 231, 297, 385, 490, etc. You could take a moment and think about it intuitively, but chances are rare that you would come up with the following formula:

P(n) ≈ (1 / (4n√3)) · e^(π√(2n/3)), an approximation of P(n) that gets better and better as n grows.
The foregoing formula was derived by the renowned mathematician Srinivasa Ramanujan (along with G. H. Hardy). This illustrates that intuition can take us only so close to the solution; formal maths might have to be invoked in some cases.
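
As a cross-check, here is a short Python sketch (ours) that counts P(n) by dynamic programming and compares the counts with the Hardy-Ramanujan approximation above:

```python
import math

def partition_counts(max_n: int):
    """P(0..max_n) via the classic coin-change style dynamic program."""
    p = [1] + [0] * max_n
    for part in range(1, max_n + 1):          # allow parts of size `part`
        for total in range(part, max_n + 1):
            p[total] += p[total - part]
    return p

def hardy_ramanujan(n: int) -> float:
    """Asymptotic approximation: exp(pi*sqrt(2n/3)) / (4n*sqrt(3))."""
    return math.exp(math.pi * math.sqrt(2 * n / 3)) / (4 * n * math.sqrt(3))

p = partition_counts(50)
print(p[1:20])                                # 1, 2, 3, 5, 7, 11, 15, 22, ...
for n in (10, 30, 50):
    print(n, p[n], round(hardy_ramanujan(n)))
```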

At Factors, we support the philosophy of crunching numbers (rather than intuition) to provide intelligent marketing insights, which are only a demo away for you to experience. To read more such articles, visit our blog, follow us on LinkedIn, or read more about us.

Find the next article in this series here.

What's next in Big Data and Analytics? (Part 2)

Analytics
August 12, 2020

In the previous blog, we very briefly went over the history of Big Data technologies. We saw how databases evolved from relational databases to NoSQL databases like Bigtable, Cassandra, and DynamoDB with the rise of the internet, along with the development of technologies like GFS and MapReduce for distributed file storage and computation. These technologies were first developed by companies like Google and Amazon, and later picked up in a big way by the open-source community.


Big Data and Enterprises

Soon enough, commercial versions of these open-source technologies were being distributed by companies like Cloudera and Hortonworks, and traditional enterprises started adopting them for their analytics and reporting needs.

Prior to this, enterprises built data warehouses, which were essentially large relational databases. Building one involved combining data from the multiple databases behind ERP, CRM and other systems into a unified and relatively denormalized database. Designing the data warehouse was complex and required careful thought. Data was updated periodically, and updating involved a three-stage process of extracting data from various sources, combining and transforming it into the denormalized format, and loading it into the data warehouse. This came to be known as ETL (Extract, Transform and Load).

With the adoption of Hadoop, enterprises could now periodically dump all their data into a cluster of machines and run ad-hoc MapReduce jobs to pull out any report of interest. Visualization tools like Tableau, PowerBI and Qlik could connect directly to this ecosystem, making it seamless to plot graphs from a simple interface while large volumes of data were crunched in the background.

Customer Centric View of Data

Databases are a final system of record, and analytics on databases only gives information on the current state of customers -- not how they got there. With the rise of the internet, a lot of businesses are now online or have multiple digital touchpoints with customers. It is now easier to instrument and collect customer data as a series of actions, be it clickstream or online transactions. This customer-centric model of data enables richer analytics and insights. Additionally, the data is incremental and can be made available immediately in reports, instead of being updated only periodically. More enterprises are moving to this model, and datastores and technologies that cater specifically to this kind of use case -- such as TimescaleDB, Druid and Snowplow -- are actively being developed.
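
To make the contrast concrete, here is a minimal Python sketch (ours; the event names and fields are invented for illustration) of customer data modelled as a series of actions, with a simple funnel computed over it:

```python
from collections import defaultdict

# Customer data as a time-ordered series of actions rather than a final state.
events = [
    {"user": "u1", "event": "Page View",   "ts": 1},
    {"user": "u1", "event": "Signup",      "ts": 2},
    {"user": "u2", "event": "Page View",   "ts": 3},
    {"user": "u1", "event": "AddedToCart", "ts": 4},
    {"user": "u2", "event": "Signup",      "ts": 5},
]

def funnel(events, steps):
    """Count how many users completed each step of `steps`, in order."""
    progress = defaultdict(int)                      # user -> index of next expected step
    for e in sorted(events, key=lambda e: e["ts"]):
        i = progress[e["user"]]
        if i < len(steps) and e["event"] == steps[i]:
            progress[e["user"]] = i + 1
    counts = [0] * len(steps)
    for reached in progress.values():
        for step in range(reached):
            counts[step] += 1
    return dict(zip(steps, counts))

print(funnel(events, ["Page View", "Signup", "AddedToCart"]))
# {'Page View': 2, 'Signup': 2, 'AddedToCart': 1}
```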

So what’s next?

To summarize, the bulk of the big data revolution that has happened in the last 15 years has been about building systems capable of storing and querying large amounts of data. The queries are raw -- i.e., if X and Y are variables in the data and x1 and y2 are two corresponding values of interest, then the system can return all data points wherein the variable X matches x1 and Y matches y2, or some post-processed result over all the matching data points. Along the way, we also have systems that can compute on large amounts of data in a distributed fashion.

So what’s next in analytics from here? Is it building machine learning models? Certainly, the availability of all this data enables organizations to build predictive models for specific use cases. In fact, the recent surge of interest in machine learning has largely come from the better results we get by running old ML algorithms at a larger scale in a distributed way. But while most ML techniques can be used to build offline models that power predictive features, they are not as useful in the context of online or interactive analytics. Most techniques are designed for high-dimensional unstructured data like language or images, where the challenge is not only to build models that fit well on seen data points, but also to generalize well to hitherto unseen data points.

Datastores that make sense of data

The next logical step would be datastores and systems that can make sense of data. Making sense of data would mean that instead of blindly pulling out data points such that variable X is x1 and Y is y2, the system should also be able to interactively answer a different class of queries (a small sketch follows the list below), such as:

  • Give the best value of variable Y that maximizes the chance that X is x1.
  • Find all the variables, or combinations of variables, that most influence X when X is x1.
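
As an illustration of the first kind of query, here is a tiny Python sketch (ours; the data and column names are invented) that, given tabular data, picks the value of Y that maximizes the empirical chance that X equals x1:

```python
from collections import Counter

# Toy data: each row is an observation of variables X and Y.
rows = [
    {"X": "converted", "Y": "email"},
    {"X": "converted", "Y": "ads"},
    {"X": "dropped",   "Y": "ads"},
    {"X": "converted", "Y": "email"},
    {"X": "dropped",   "Y": "email"},
    {"X": "dropped",   "Y": "ads"},
    {"X": "converted", "Y": "email"},
]

def best_y_for(rows, x1):
    """Return the value of Y maximizing the empirical P(X == x1 | Y == y)."""
    hits, totals = Counter(), Counter()
    for r in rows:
        totals[r["Y"]] += 1
        if r["X"] == x1:
            hits[r["Y"]] += 1
    return max(totals, key=lambda y: hits[y] / totals[y])

print(best_y_for(rows, "converted"))   # 'email' (3/4 vs 1/3 for 'ads')
```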

Such a system would continuously build a complete statistical or probabilistic model as and when data gets added or updated. The models would be descriptive and queryable, and the time taken to infer or answer these different classes of queries should be tractable. But just as there is a host of databases, each tuned differently for

  • Data Model
  • Scale
  • Read and Write Latencies
  • Transaction guarantees
  • Consistency, etc

We could possibly have different systems here tuned for

  • Assumptions on Data Model
  • Accuracy
  • Ability to Generalize
  • Scale of the data
  • Size of the models
  • Time taken to evaluate different types of queries.

Autometa is one such system -- the first of its kind -- that we are building at factors.ai. It continuously makes sense of customer data to reduce the work involved in inferring from it. Drop us a mail at hello@factors.ai to know more or to give it a try.

Big Data and Analytics - What's next? (Part 1)

Analytics
August 6, 2020

Apache Hadoop, Hive, MapReduce, TensorFlow... These and a lot of similar terms come to mind when someone says Big Data and Analytics. It can mean a lot of things, but in this blog we will restrict it to the context of analytics done on relatively structured data, collected by enterprises to improve their product or business.

When I started my career as an engineer at Google around a decade back, I was introduced to MapReduce, Bigtable and the like in my very first week. These were completely unheard of outside and seemed like technologies accessible and useful to only a select few in big companies. Yet, within a few years, there were small shops and training institutes springing up to teach Big Data and Hadoop, even in the most inaccessible lanes of Bangalore.

It’s important to understand how these technologies evolved or rather exploded, before we dwell upon the next logical step.

Dawn of time

Since the dawn of time (or rather, the Unix timestamp), the world was ruled by relational databases -- something most engineers are familiar with. Data is divided (or normalized) into logical structures called tables. These tables are not completely independent: they are related to each other using foreign keys, which are data entries common across tables.

Take the example of data from a retail store. The database could have 3 tables: one for the Products it sells, one for the Customers of the store, and one for the Orders of products bought in the store. Each entity can have multiple attributes, stored in different columns of the corresponding table, and each data point is stored as a row in the table. The Orders table contains entries of products bought by different customers and is hence related to both the Products and Customers tables, using the columns product_id and customer_id.


A few implications of this structure are:

  • Since each data unit is split across tables, most updates involve updating multiple tables at once. Hence, transaction guarantees are important here, wherein you either update all the tables or none at all.
  • Data can be fetched almost any way you want. For example, we can fetch all orders placed by a specific customer, or all customers who bought a specific product. Additional indices can be defined on columns to speed up retrieval. But since data is split across tables, retrieval can sometimes involve costly joins when matching related items across tables.

SQL (Structured Query Language) became the de facto standard to query these databases and thus SQL databases also became the namesake for relational databases. These served the needs of all enterprises. As the data grew, people moved to bigger and better database servers.
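
As a minimal, self-contained sketch of the relational model described above (ours, using SQLite purely for illustration rather than any particular enterprise database), here is the retail-store schema with foreign keys, an index, and a join:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE products  (product_id  INTEGER PRIMARY KEY, name TEXT, price REAL);
CREATE TABLE orders    (order_id    INTEGER PRIMARY KEY,
                        customer_id INTEGER REFERENCES customers(customer_id),
                        product_id  INTEGER REFERENCES products(product_id));
CREATE INDEX idx_orders_customer ON orders(customer_id);   -- speeds up retrieval by customer

INSERT INTO customers VALUES (1, 'Asha'), (2, 'Ravi');
INSERT INTO products  VALUES (10, 'Laptop', 55000), (11, 'Mouse', 700);
INSERT INTO orders    VALUES (100, 1, 10), (101, 1, 11), (102, 2, 11);
""")

# All orders placed by a specific customer -- a join across the related tables.
rows = conn.execute("""
    SELECT c.name, p.name, p.price
    FROM orders o
    JOIN customers c ON c.customer_id = o.customer_id
    JOIN products  p ON p.product_id  = o.product_id
    WHERE c.name = 'Asha'
""").fetchall()
print(rows)   # [('Asha', 'Laptop', 55000.0), ('Asha', 'Mouse', 700.0)]
```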

Rise of Internet

Then, in the 90’s, came the internet. One of the limitations of a SQL database is that it needs to reside on one machine to provide the transactional guarantees and to maintain relationships. Companies like Google and Amazon that were operating at internet scale realized that SQL could no longer scale to their needs. Further, their data model did not need to maintain complex relationships.

If you store and retrieve each data unit as a whole, rather than in parts across tables, then each data unit is self-contained and independent of other data. The data can now be distributed across different machines, since there are no relationships to maintain across machines.

Google, for instance, wanted to store and retrieve information about a webpage only by its URL, and Amazon product information by product_id. Google published a paper on Bigtable in 2006, and Amazon on Dynamo in 2007, describing their in-house distributed databases. While Dynamo stored data as key-value pairs, Bigtable stored data by dividing it into rows and columns. Lookups can be done by row key in both databases, but in Bigtable only the data in the same column family is co-located and can be accessed together. Given a list of rows and columns of interest, only those machines which held the data were queried and scanned.
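
To picture the data model (a deliberately simplified sketch of ours, not Bigtable's or Dynamo's actual API), think of a nested map from row key to column family to column to value:

```python
# A toy, in-memory picture of a Bigtable-style layout:
# row key -> column family -> column -> value
table = {
    "cnn.com": {
        "contents": {"html": "<html>...</html>"},
        "links":    {"bbc.com": "anchor text", "nytimes.com": "anchor text"},
    },
    "bbc.com": {
        "contents": {"html": "<html>...</html>"},
        "links":    {"cnn.com": "anchor text"},
    },
}

def lookup(row_key, column_family):
    """Fetch one column family of one row -- the cheap, key-based access path."""
    return table.get(row_key, {}).get(column_family, {})

print(lookup("cnn.com", "links"))   # all outgoing links stored for cnn.com
```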


Now you no longer needed bigger and better machines to scale. The mantra changed from bigger, super-powered machines to cheap, commodity hardware with excellent software. And since the hardware was assumed to be unreliable, the same data had to be replicated and served from multiple machines to avoid data loss.

Open-source projects soon followed suit. Based on different trade-offs in read and write latencies, assumptions in the data model, and flexibility when retrieving data, we now have a plethora of distributed databases to choose from -- HBase, MongoDB and Cassandra, to name a few. Since these databases were not relational (or SQL), they came to be known as NoSQL databases.

Related Big Data Technologies

This fundamental change in databases also came with auxiliary changes in how data was stored and used for computation. Most data is stored in files. But now these files should be accessible from any of the machines, they could grow to be very large, and they should not be lost when a machine goes down.

Google solved this by breaking files into chunks of roughly equal size and distributing and replicating these chunks across machines, while keeping files accessible within a single namespace. A paper on this distributed file system, called GFS, was published way back in 2003. Bigtable was, in fact, built on top of GFS.

Distributed databases allowed you to access data only in one way (or a couple of ways), using keys. It was not possible to access data based on the values present inside the data units. In SQL, you can create an index on any column and access data based on the values in it. Take the example of Google storing web pages: you could access information about a webpage using the URL cnn.com (row key), or you could get the links in a given webpage using the row key (cnn.com) and a column key (links). But how do you get the URLs of web pages that contain, say, the words “Captain Marvel”?

So if the data needed to be accessed in a different way, it had to be transformed such that data units related to each other by the values they hold come together. The technology used to do that was MapReduce. It has two phases: first, Mapper processes load the data in chunks on different machines and emit, say, the URLs of pages that contain the words “Captain Marvel”; these are then sent to another process, called the Reducer, which collects and outputs all the matched URLs. More complex data transformations and joins across different sources usually require pipelines of MapReduce jobs. The MapReduce framework was generic enough to perform various distributed computation tasks and became the de facto standard for distributed computing. The paper on MapReduce was published by Google in 2004.
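
Here is a toy, single-machine Python sketch of that map/shuffle/reduce flow (ours, for illustration only; real frameworks distribute these phases across machines):

```python
from collections import defaultdict

# Toy corpus: url -> page text.
pages = {
    "cnn.com/a":  "captain marvel review ...",
    "bbc.com/b":  "election results ...",
    "imdb.com/c": "captain marvel cast and crew ...",
}

def mapper(url, text):
    """Emit (key, value) pairs: here, the query term mapped to matching URLs."""
    if "captain marvel" in text:
        yield ("captain marvel", url)

def reducer(key, values):
    """Collect all values emitted for a key."""
    return key, sorted(values)

# Shuffle: group mapper output by key, then reduce each group.
grouped = defaultdict(list)
for url, text in pages.items():
    for key, value in mapper(url, text):
        grouped[key].append(value)

print([reducer(k, vs) for k, vs in grouped.items()])
# [('captain marvel', ['cnn.com/a', 'imdb.com/c'])]
```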


Yahoo soon took the cue and developed and open-sourced these technologies -- which we all know as Hadoop, later adopted by Apache. Now, if MapReduce can be used to transform data, it can also be used to retrieve data that matches a query. Technologies like Apache Hive, Dremel and BigQuery were developed, which allowed users to fire SQL queries on large amounts of structured data, with the results delivered by distributed jobs (MapReduce, in Hive’s case) crunching the data in the background. An alternative to loading data onto a different machine and then computing on top of it is to take the computation closer to where the data resides. Frameworks like Apache Spark were developed broadly on this philosophy.

In the next blog, we will look at some of the current trends in these technologies and discuss how we think they will evolve.

FactorsAI + Segment: Easy and instant analytics to drive growth

Product
August 6, 2020

We are excited to announce our integration with Segment, further enabling companies to easily instrument user interactions across platforms and push different types of customer data from any third-party source to FactorsAI in real time.


FactorsAI provides advanced and intuitive analytics for marketers and product managers, to help drive growth. With FactorsAI you get immediate insights to optimize marketing campaigns, improve conversions and understand user behaviours that drive feature adoption and retention.

A good analytics setup requires detailed tracking of user actions -- like Page View, Signup, or AddedToCart -- with their different attributes. The quality of the insights on user behaviour that FactorsAI shows depends on this level of detail in tracking. With the Segment integration, this is a one-time setup, and you can send the same events to other tools for marketing automation, CRM and so on.
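
For example, with Segment’s Python library (analytics-python), tracking such events is a single call per action. A minimal sketch, assuming you have a Segment write key; the event names and properties below are illustrative:

```python
import analytics  # Segment's analytics-python library

analytics.write_key = "YOUR_SEGMENT_WRITE_KEY"   # placeholder

# Track user actions with attributes; Segment forwards them to FactorsAI
# (and any other enabled destination) without custom API code.
analytics.track("user_123", "Signup", {"plan": "trial", "source": "landing_page"})
analytics.track("user_123", "AddedToCart", {"product": "t-shirt", "price": 499})

analytics.flush()   # send queued events before the script exits
```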

Further, with the Segment integration you can send data from different sources like email or live chat, which produce events like Email Delivered, Email Clicked, and Live Chat Started. These additional events are useful when analyzing user conversions, and by using Segment this can be done without the need to write custom code to hit our APIs.

Segment can perform all data collection tasks for FactorsAI: it captures all the data that FactorsAI needs and sends it directly to FactorsAI in the right format, all in real time. So, if you are on Segment, you can start getting insights on how to grow your customer base in no time.

To integrate with Segment, follow the steps here. Happy Analyzing!
