Factors Blog

Insights Across All Things B2B Account Intelligence & Analytics

Attribution is Broken (Part II): Too Many Cooks in the Kitchen

Analytics
August 16, 2021

The following post is the second part of our “Attribution is Broken” series.

Here’s a link to the introductory post if you’re interested.

I recently came across an Instagram ad for a shiny new pair of noise-cancelling headphones. Being the mindless sheep I am, I decided that I needed a pair. So after some light research involving a few customer reviews and price comparisons, I went ahead and bought them. From start to finish, the purchase process took me about an hour or so. Admittedly, the headphones set me back a little, but who cares? I can always return them if I'm not happy, right? This was a short and sweet journey that's easily digestible by most multi-touch attribution tools. And yet, the journey takes quite the turn when marketers want to reach out to businesses instead.

B2B purchase decisions are tricky affairs. They involve complex high-value contracts, lengthy sales cycles that stretch over several months, and limited scope for backtracking once confirmed. As a result, all B2B purchases — especially those made in technology — are critical decisions. So, to mitigate the risk of making poor purchases, organisations include multiple stakeholders across multiple departments over multiple levels of seniority in their decision-making process. As an unfortunate consequence, however, this involvement of heterogeneous stakeholders tremendously complicates the account’s journey from awareness to purchase. 

Here’s a simple example of a complex B2B sales cycle:

HubForce, a promising CRM start-up, takes out a couple of ads on LinkedIn and Facebook. They also publish blog content and host interactive webinars on a regular basis. Additionally, HubForce's SDR team requests demo meetings from CSOs, Demand Gen VPs, and Project Managers on a daily basis through outbound emails.

Ali, who is project head at Drifter (a leading chatbot service provider), receives one such email. Ali happens to be in the market for a CRM tool and schedules a demo with HubForce. HubForce's sales head, Vinay, walks Ali through the several technical features they have to offer, including HubForce's ability to integrate with Drifter's current tech stack and a cutting-edge AI tool that automates a lot of Ali's grunt work. Ali is impressed and wants to onboard HubForce. However, he needs to run the purchase decision by his CEO, Anaiya, before making it official.

Upon hearing Ali's rave reviews, Anaiya is curious to learn a little more about HubForce. She reads a couple of their blog posts and digs up a few reviews written by existing customers. Being a fastidious CEO, Anaiya also schedules a follow-up meeting with Vinay. This time around, Vinay demonstrates what HubForce can bring to Drifter's revenue and sales pipeline. Rather than zero in on technical details, Vinay focuses on HubForce's big-picture gains. Anaiya likes what she sees but wants to discuss budget constraints with her finance chief, Albert, before signing on the dotted line.

During their weekly catch-up, Anaiya fills Albert in on the HubForce deal — specifically the pricing details. Albert isn't thrilled. He's of the opinion that Drifter would be overpaying for what's essentially a roided-out Excel. Upon hearing this, Anaiya decides to put the deal on hold until next quarter. During this time, Albert is frequently targeted by HubForce ads on LinkedIn. He even attends one of HubForce's webinars on their cutting-edge, AI-powered CRM technology. Eventually, Albert is convinced of the value the CRM platform could bring to Drifter.

As the next quarter rolls around, Ali, Anaiya, and Albert discuss the deal one last time. They weigh the pros and cons and arrive at a unanimous decision to purchase a HubForce subscription. Congratulations you guys!

Clearly, this purchasing process was far more complex than the case of the headphones. A nuanced web of back-and-forth interactions had to take place before the deal could be closed. As a marketer looking to replicate this process in a scalable manner, multi-touch attribution is your go-to tool. Attribution modelling empowers marketers to unravel intricate customer journeys and understand the performance of nearly every marketing activity. Attribution reveals, to a large extent, which campaigns are working and which aren't. In turn, marketers can make data-driven resource allocations across their marketing activities. All that being said, attribution isn't without its challenges when it comes to dealing with multiple stakeholders.

Across the length of the previous example, HubForce depended on a variety of content, strategies, and channels to get their deal across the line. They had to sell different aspects of their products to different types of audiences. Project managers may care about practical details like integration, accessibility, and time-saving. CEOs may be interested in high-level gains like ROI, pipeline, and revenue. Finance heads want to know that they’re getting the best possible price. On top of all this, each position is filled by individuals with their own motivations and preferences. The one-on-one demo clearly worked for Ali, but Anaiya chose to perform some background research as well. Albert, on the other hand, was convinced after a couple of targeted ads and a relevant webinar. All these variables contribute to the challenges of B2B attribution:

The B2B Buyer Dichotomy

B2B marketers engage with individual contacts through personalised emails, targeted ads, etc. However, the purchase decision ultimately involves a buying committee. In the example above, three stakeholder groups make up the buying committee: the core buying group (Ali and his project team), the group that negotiates terms (Albert and his finance team), and the group that exercises final approval (Anaiya, the CEO).

The core buying group initiates the process: it identifies the need for the product, ideates on potential solutions, and evaluates the options. The group that negotiates the terms focuses on protecting the company's interests; it typically includes members from teams like legal and finance. Lastly, the final-approval group has the ultimate say. Its focus is the company's larger aims and strategy.

The marketer has to align these diverse internal stakeholders during the sales journey.

Different Strokes for Different Folks

Given that the different internal stakeholders within the buying committee have different core focuses, the marketer needs to adjust their approach to each group depending on what it cares about. In our example, for instance, finance cares most about pricing, the CEO cares about revenue and ROI, and the marketing team would care about metrics like conversions, pipeline, etc.

In addition, the sales cycle is often complicated and non-linear. Complex B2B purchases, such as enterprise software, give the buying committee a lot more information to consider. The process becomes more drawn out with the complexity of the solution and the presence of alternatives. The multiple stakeholders in an account, each with different preferences and objectives, may revisit the various stages of the buying process non-sequentially and, sometimes, simultaneously. Stakeholder behavior can also be loopy: a stakeholder may switch from interested to not interested and back to interested again, as we saw in our example.

Each stakeholder group keeps referring to each other in non-linear learning loops before they come to the final decision of moving forward with the purchase or not.

Invisible Touchpoints

The touchpoints in our sales cycle are of different types. While digital ads, reviews, and page views are visible, some touchpoints are invisible. Attribution models trying to map stakeholders may be unable to account for them. In our HubForce example, for instance, the finance head, who was not entirely on board with the CRM purchase, attends a webinar that finally leads to the deal being won. Data issues can arise if your CRM and marketing automation data are not flowing properly; in that case, the impact of the webinar never gets stitched into the sales journey.

Today, most B2B marketers employ a single attribution model across a fixed timeline to derive insights from their campaign data. Sure, this approach is easy, quick, and uncomplicated. But it is also dangerously inaccurate. The issues brought on by the involvement of several stakeholders (heterogeneous preferences and objectives, long sales cycles, loopy back-and-forth interest, and a diverse range of touchpoints) render simple attribution modelling ineffective. Instead, marketers should aim to treat each group of users independently and learn what works best for each one. This involves parsing out each type of customer and individually employing the appropriate model. This approach allows you to ask nuanced questions and derive genuinely actionable insights. Of course, this is a far more advanced process than an all-encompassing approach — but it's infinitely more accurate as well.
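As a rough illustration of what "employing the appropriate model per group" could look like in practice, here's a minimal Python sketch. The segment names, journeys, and model assignments below are all hypothetical; they are not a prescription and not how any particular tool implements this.

```python
# Hypothetical sketch: apply a different attribution model per stakeholder
# segment. Segment labels, journeys, and model choices are invented
# for illustration only.

def first_touch(touchpoints):
    """Assign all credit to the first interaction."""
    return {touchpoints[0]: 1.0}

def linear(touchpoints):
    """Split credit evenly across every interaction."""
    share = 1.0 / len(touchpoints)
    credit = {}
    for tp in touchpoints:
        credit[tp] = credit.get(tp, 0.0) + share
    return credit

# Which model to apply to which segment (hypothetical mapping).
MODEL_BY_SEGMENT = {
    "core_buyer": first_touch,  # e.g. Ali: the outbound email started it all
    "finance": linear,          # e.g. Albert: ads and a webinar worked together
}

journeys = {
    "core_buyer": ["outbound_email", "demo"],
    "finance": ["linkedin_ad", "linkedin_ad", "webinar"],
}

for segment, touchpoints in journeys.items():
    credit = MODEL_BY_SEGMENT[segment](touchpoints)
    print(segment, credit)
```

The point isn't these particular models; it's that each segment's journey is scored with the logic that fits how that segment actually behaves.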

So what’s the solution for implementing incredibly advanced attribution models? 

Well, an incredibly advanced attribution platform of course! 

Learn more about Factors.AI's cutting-edge attribution here.

Attribution is Broken (Part I)

Analytics
July 30, 2021

In 1908, Henry Ford introduced the Model-T to the world with a full-page advertisement in Life magazine. The print ad read like an article and was chock-full of technical jargon by design. Back then, a marketer’s function was straightforward — inform all potential customers of the existence and superiority of the product. Who you were marketing to wasn’t half as important as what you were marketing. As long as buyers in the market were aware of the Model-T’s vanadium steel chassis and four-cylinder engine, Ford’s marketing team could sleep well at night knowing they had done their jobs.

Of course, the role of the marketer has evolved *a little* since then. At the time, print ads were one of the few viable communication channels available to marketers. There was also a stubborn focus on the product itself — with little thought given to what worked for each customer. Owing to years of progress in marketing technology and a radical shift towards customer centricity, marketers today have a lot more to think about. Recent digital transformations have empowered marketers with dozens of channels: social media, email, blogs, videos, podcasts, websites, etc.  In turn, they’re able to reach potential customers with content that’s specifically tailored to them. 

On the other side of the equation, digital transformation has also provided customers with far more control. Relevant market information (product details, reviews, alternatives) is instantly accessible to potential buyers. And when your competitors are a single click away from you, there is no room for complacency. As a result, the modern marketer must go above and beyond traditional information distribution. Today, the four staple functions performed by marketers are: 

  • Delivering predictable pipeline and revenue 
  • Building the company’s brand 
  • Developing long-term growth initiatives 
  • And empowering the sales team 

Still, as marketing has evolved in terms of technology and practice, analysing data and deriving insights have grown increasingly complex as well. While marketers are able to design sophisticated multi-channel campaigns, determining the basic metrics — what’s working, what’s not, which campaigns to invest in, etc. — can become tricky. Here’s an example to illustrate this: 

Gendesk, a help desk software start-up, takes out advertisements on YouTube and Facebook. Deepti, a customer success VP, stumbles upon the YouTube ad while trying to watch a video of a sleep-talking cat. She takes notice of Gendesk and clicks through to their website. Though she likes what she sees, she forgets to sign up for a demo. Later that week, Deepti comes across the Facebook ad while scrolling through her feed. This time, she makes sure to schedule a call and finds the product to be a great fit. After discussing with her team, Deepti decides to make the purchase.

As a marketer, this is great news. But when you're looking to repeat this process in a scalable manner, a key question to ask yourself is: "Which ad do I credit for the purchase decision?" Though there are cases to be made for each ad, the right answer is a subtle combination of both. Identifying this combination of credit, or in other words, determining the values to attribute to the various touchpoints along the customer journey, is now the holy grail of marketing analytics.

Enter: Marketing Attribution

The previous example was based on a highly simplified customer journey — one customer and two channels. In reality, marketers target several types of customers and employ several different channels to engage with their audience. What's more, the buyer's journey is almost never a linear path. Deepti may well have stumbled upon the YouTube ad, visited Gendesk's website, interacted with their chatbot, reviewed the pricing page, read a blog about the product, and clicked back to the website before coming across the Facebook ad and making her purchase. Marketing attribution is a tremendously powerful system that determines these various touchpoints along the customer journey and attributes a percentage value to each one of them.
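To make the idea of attributing a percentage value concrete, here's a hedged Python sketch of one well-known heuristic, the position-based ("U-shaped") model. The 40/20/40 split and the touchpoint list are illustrative assumptions, not a claim about how any specific platform scores journeys.

```python
# Sketch of a position-based ("U-shaped") multi-touch model: 40% of the
# credit to the first touch, 40% to the last, and the remaining 20%
# shared across the middle touches. Splits and journey are illustrative.

def u_shaped(touchpoints, first=0.4, last=0.4):
    if len(touchpoints) == 1:
        return {touchpoints[0]: 1.0}
    credit = {tp: 0.0 for tp in touchpoints}
    credit[touchpoints[0]] += first
    credit[touchpoints[-1]] += last
    middle = touchpoints[1:-1]
    remainder = 1.0 - first - last
    if middle:
        share = remainder / len(middle)
        for tp in middle:
            credit[tp] += share
    else:
        # With only two touches, split the remainder between them.
        credit[touchpoints[0]] += remainder / 2
        credit[touchpoints[-1]] += remainder / 2
    return credit

journey = ["youtube_ad", "website_visit", "chatbot", "facebook_ad"]
print(u_shaped(journey))
```

Under this model, the YouTube ad that started Deepti's journey and the Facebook ad that closed it each receive the lion's share of the credit, while the middle touches split the rest.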

Okay, but why’s marketing attribution so important anyway?  

“The reality is that marketing has become THE most efficient way to accelerate growth in our digital economy. The imperative is to connect the dots, so each marketing expense dollar is aligned and reported against revenue growth.”

- Paul Albright of Captora. 

A well-oiled marketing attribution system can result in efficiency gains of up to 30%. At its core, attribution modeling enables marketers to allocate resources in a strategic manner. Marketers can ensure that they're actively driving conversions by optimizing their spending based on data-driven metrics. Gendesk's marketing team, for example, can use a variety of attribution models to derive an understanding of what campaigns are working, and what campaigns aren't. Accordingly, they can make evidence-based decisions on where to invest and what to alter. Ultimately, this results in a notable rise in ROI, a stronger grasp of SEO/SEM, and an improved alignment between marketing and sales. On average, marketers employ at least 6 communication channels to reach their customers today. As this number continues to rise, attribution will only become increasingly critical to the success of modern marketing initiatives.

________

All that being said, marketing attribution isn’t without its challenges. In fact, even after the emergence of highly effective multi-touch models, several organizations continue to report attribution manually through spreadsheets. 

There are many considerations that go into choosing the right attribution model, and each can present challenges for the marketer:

The Sales Cycle: 

Attribution is a lagging indicator; it takes time and patience to see whether models are working. Depending on the length of the sales cycle, the effects of a new campaign, or of changes made to existing ones, will only show up much later.

Ease of Set-up and Implementation: 

30% of companies in the UK say that they have chosen their current attribution model based on ease of use. If put in a position to choose between a model that is easy to implement and a complex model that would be tedious for the team to implement, marketing heads would prefer the simpler model. Similarly, technological limitations may also hinder the execution and implementation of attribution models. 

A Culture of Data and Measurement: 

To be able to value the insights provided by attribution models, there needs to be a culture of measurement and accuracy within marketing teams.

Communication of Insights: 

Communicating the model's insights matters both for justifying costs and for acting on what attribution reveals. To secure funds and approvals for software costs, and for implementation costs in terms of time, effort, and training, the team needs to be able to communicate the insights clearly and accurately.

Attribution to Improve, Not Prove: 

Marketers often use attribution to prove that campaigns are working. As mentioned in the earlier section, this is important for justifying costs. However, limiting attribution to this purpose leads to lost insights and higher costs. Attribution, at its core, is directional in nature: models can be used to see what is working well and, just as importantly, what is not working and should be abandoned. Marketing and sales teams often run several kinds of campaigns, and attribution is a useful tool for seeing which ones perform better and can be emulated in future projects.

Volume bias: 

Most often, an organisation's highest-volume campaign can show up as its most successful campaign if marketers do not track other metrics like conversion rate and win rate. To see why, consider an organisation that sells CRM software to businesses. Say in the last six months they saw a total of 500 downloads, of which 400 were attributed to Campaign A (in-person promotional events like webinars) and the remaining 100 to Campaign B (ads on YouTube and Instagram). By themselves, these numbers make Campaign A look like the more successful campaign. But what if the 400 downloads came from a total of 10,000 attendees at those events, while the 100 downloads from Campaign B came from just 500 users who were shown the ads? The conversion rates for Campaigns A and B then work out to 4% and 20% respectively. This comparison suggests that if Campaign B were promoted further, with more funds and effort directed towards it, the organisation might have seen more downloads, given its higher conversion rate relative to Campaign A.
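The comparison above is easy to reproduce. Here's a small Python sketch using exactly the numbers from the example (the campaign labels are shorthand):

```python
# Recompute the volume-bias example: raw downloads vs. conversion rate.
# Downloads and audience sizes are taken from the example above.

campaigns = {
    "A (webinars/events)": {"downloads": 400, "audience": 10_000},
    "B (social ads)": {"downloads": 100, "audience": 500},
}

for name, stats in campaigns.items():
    rate = stats["downloads"] / stats["audience"]
    print(f"Campaign {name}: {stats['downloads']} downloads, "
          f"{rate:.0%} conversion rate")
# Campaign A wins on raw volume; Campaign B wins on conversion rate (20% vs 4%).
```

Tracking the rate alongside the raw count is what surfaces the bias.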

Absence of predetermined hypotheses: 

To get effective insights from an attribution model, marketers need to be specific about what they're trying to measure. For example, the conversion rate for leads from campaign X, within the 30 days since it went live, for geographic location Y, can be used to understand whether the campaign was successful with the target audience in that location. If marketers do not know exactly what they are looking for, they will end up with a generic, overall attribution report and miss out on gainful insights.

Invisible touchpoints: 

Several attribution models used by organisations do not account for certain important touchpoints. Models that do not track the relationship between online activity and offline sales can introduce digital signal bias. For example, someone might see the ad for a clothing app on Instagram but decide to go to the store and purchase the item. Models that exclude sales touches may miss the impact of sales actions. On one hand, this hampers the accuracy of the outcome metrics; on the other, it may cause friction with the sales team instead of aiding collaboration between the two teams.

In order to choose the right attribution model for your team and reap the benefits that attribution brings to modern marketing, marketers need to be wary of these challenges and address them.

In further blog posts, we will be exploring the various challenges of attribution that we have outlined here in greater detail.

Revenue Marketing: New and Improved

Marketing
July 27, 2021

I recently came across an article that placed a great deal of emphasis on getting your definitions right. Of course, ‘defining’ things — roles, processes, objectives — holds plenty of value. From providing clarity and purpose to qualifying breakthrough ideas, a good definition can help teams go a long way in reaching their goals. And yet, even the most precise definitions are bound to change.

With that in mind, this post discusses the elements that define the new and improved Revenue Marketer. In particular, we explore six pillars of Revenue Marketing and highlight the value of data, technology, and organisational alignment in effectively driving revenue growth.

But first, let’s quickly run over the fundamentals of Revenue Marketing.

Like many others, I learned about the term 'Revenue Marketing’ through Dr. Debbie Qaquish. About 10 years ago, during a transition from a long career in sales to a role in marketing, her CEO sat across her desk and posed a single question: “What are you going to do about revenue?” Long story short, this set off the development of a significant approach that transforms marketing teams from flowery cost centers to high-performing revenue machines. This approach, we've come to know as ‘revenue marketing’.

“Revenue marketing is the combined pillars of strategies, processes, people, technologies, content, and results across marketing and sales that drop leads to the top of the funnel, accelerates sales opportunities through the pipeline, and measures marketing based on repeatable, predictable, and scalable contribution to pipeline, revenue, and ROI” 

Phew. 

That was a mouthful.

Now don’t get me wrong; this continues to remain the foundation upon which Revenue Marketing is built. But back then, the market looked very different from what it is today. We’ve had major changes that mandate an updated definition of revenue marketing. Accordingly, here are three additional challenges that redefine what it means to be a revenue marketer today.

Challenge #1 - Digital transformation

In 2011, the average number of technologies available to the marketing industry was about 150. Today, that same measure stands at an astonishing 7,000. It's becoming increasingly normal for marketing teams to employ upwards of 30 or even 40 different MarTech products. But digital transformation isn't just about getting your hands on the hottest new tech toy. Marketers now have to choose between all-encompassing platforms like Salesforce and specialised best-in-class solutions for each use-case. The key challenge is to centralise customer data and orchestrate these platforms to deliver a personalised customer experience.

Challenge #2 - Customer centricity

It's no secret that as an industry, marketing has been progressing towards customer-centricity. Now more than ever, a firm’s customer experience signals its competitiveness in the market. Again, at the root of this change is digitalisation and technology. Digital customers are in control because your competition is now a single click away from you. Accordingly, identifying and employing the appropriate marketing channels — and distributing relevant content within those channels becomes a key challenge. 

Challenge #3 - Revenue accountability

A 2019 report by Duke University found that 80% of CMOs are under pressure to deliver ROI, revenue, and growth. However, only about a third provide any financial reports as a result of technological inaccessibility and an overall lack of training. Though we have countless programs and platforms to crunch marketing data and derive revenue metrics, they can be a little too inaccessible for marketers without analytical backgrounds to make effective use of. 

And so, we arrive at three challenges — each one rooted to varying extents in data, technology, and alignment — that are driving the new definition of revenue marketing.

The new and improved Revenue Marketer 

Teams in leading B2B companies continue to transform themselves from cost centers to predictable and scalable revenue machines. Except now, they have an additional focus on digital transformation, customer-centricity, and revenue accountability. As a result, marketing is driving non-linear growth in a world where buyers are averse to direct sales.

Okay - so far, we've established our basis for the contemporary definition of revenue marketing. But let's go even further. Not only are data, technology, and alignment fundamental in defining revenue marketing; they are essential to every capability within every pillar associated with the approach as well.

Strategy

In revenue marketing, strategy involves understanding your team’s readiness for change, aligning your company’s key business initiatives, and most importantly — forming revenue synergy with sales. While a large part of this ‘getting everyone on the same page’ process involves planning, communication, and leadership; technology is playing an increasingly important role as well. Though instinct and qualitative responses can complement strategy, data, metrics, and indicators are crucial ingredients in developing accurate customer profiles and journeys. And as all three merge across sales and marketing, teams require ecosystems that are conducive to a symbiotic, well-aligned workflow. An easily accessible analytics platform (*ahem* Factors.AI) enables sales and marketing folk to speak the same language — revenue.

//Factors.AI is an AI-powered marketing analytics platform that provides critical insights into your marketing activities, decodes customer behaviour, and empowers your marketing team to focus on real strategic decisions. In short - we do all the analytical heavy lifting for you.//

Process

The process pillar isn’t dissimilar to traditional marketing. In general, Process primarily involves campaigns and data. Accordingly, there are two aspects worth highlighting — campaign management and data management.

Campaign management involves executing, tracking, analysing, and measuring digital conversions in terms of business impact. There has been tremendous progress in the MarTech space within each of these functions. Not simply to automate the process, but to derive detailed insights as well. It’s a similar story with data management. Easy access and insight into your marketing data can make all the difference in the world. Implementing this process could be as simple as consolidating all your data under a single roof or automating any recurring analysis.

//Factors.AI enables your marketing team to consolidate and crunch marketing data from across all your sources - Google, Linkedin, Facebook, and more. Our integration process is completely code-free as well. In fact, we could have your marketing team onboarded in a single week.//

People

The people pillar consists of broad capacities involving the management of people in and outside of marketing. Stakeholder alignment, resource planning, and talent acquisition are important, but talent management in particular, is an aspect worth highlighting. A firm can employ all the data and technology in the world, but if the marketing team doesn’t have sound control over these tools, they won't be of much use at all. One solution to avoid this issue is to keep things simple.

//Factors.AI is simple by design. Our platform has been tailored to make the user experience very, very intuitive. In fact, our AI-powered analytics platform does all the work behind the scenes, so getting detailed insights into your data becomes as straightforward as a Google search.//

A training program with a specific focus on revenue marketing tools can also go a long way in improving technical fluency and ensuring your team has a good grasp of revenue-oriented data.

Customer

As a revenue marketer, it is important to understand your customer across their entire life cycle. It’s no longer sufficient for marketers to get a customer through the door and call it a day.  Revenue marketing encourages you to keep tabs on all the touchpoints a customer goes through. Additionally, a revenue marketer aims to optimize their customer data - not only to improve campaign performance but to access valuable business insights as well. A second aspect that’s closely tied to the customer is content management. The batch and blast approach simply doesn’t make the cut anymore. It’s just as important for content to be relevant to the intended audience as it is for that content to travel through the right channels.

//Multi-touch attribution, End-to-end customer insights, and Automated analysis are but a few of the several features Factors.AI has to offer. When coupled with highly customisable campaign analytics - our platform makes for a very simple, very powerful marketing tool.//

Results 

Finally, we arrive at Results. To a revenue marketer, results involve a variety of measures associated with financial outcomes (shocker!). But it doesn't end there. Along with delivering an impressive ROI, revenue marketers also aim to accurately forecast their revenue. In essence, they construct a marketing machine that drives repeatable, predictable, and scalable revenue. I probably sound like a broken record at this point, but analysing data, utilising the right tools, and ensuring organisational alignment are crucial elements at this stage. Needless to say, sufficient training and practice won't do any harm either.

//Factors.AI’s explain feature differentiates us from the rest of the game. Along with consolidating your data and performing automated analytics, our AI-powered platform provides actionable insights in a matter of minutes.//

Over the course of this post we’ve discussed what it means to be a Revenue Marketer today, we’ve briefly explored the six pillars associated with revenue marketing, and we’ve highlighted the value of utilising data, ensuring alignment, and employing the right tools and technologies. At the end of the day, revenue marketing is a pretty straightforward idea — A well-organised, well-equipped approach that empowers marketing teams to bring in money in a predictable, scalable manner. So as a marketer, the only question left to ask yourself is this:

“What are you going to do about revenue?”

Intuition can only take us so far: Fun with Factors (Part 2)

March 9, 2021

Continuing with our series on “Fun with Factors” (please find the first part here), we had another session on “Intuition can only take us so far”, wherein we discussed how non-intuitive concepts such as irrational numbers are very much real. Furthermore, we established the importance of grounding ideas to their bare-bones structure, lest we confuse ourselves and fall into paradoxes.

The Irrational Route

For a number to be rational is to be expressible as a fraction -- the well-known p-by-q (p/q). Now, just for completeness, recall that ‘p’ and ‘q’ should be integers, and ‘q’ should be non-zero.

That said, is it not easy to see that every number is rational? What’s the big deal? Wait, prepare to be challenged! You need to prove (or disprove) that the square root of 2 (i.e., √2) is a rational number. Oh, I heard you! You say √2 is an "imaginary" concept with no practical existence. Smart; you took the challenge to another level! So let’s first see what √2 looks like, and how it’s very real!

Take a square piece of cloth ABCD, each side of which measures 1 m. Now cut it into two pieces along one of its diagonals (say, AC). What you get are two right-angled triangles ABC and A’DC’. Let’s take one of them -- ABC. How much do its sides measure? We know AB = 1 m and BC = 1 m; but AC = ?.


Following Pythagoras’ advice, we could compute AC = √(AB² + BC²) = √(1+1) = √2. Bingo! We have a triangular cloth with one side measuring √2 metres. But you might object: “Why √2? I used a ruler and measured it to be 1.414 m.” Are we in a fix? Not yet. Analytically, we have AC = √2, but a ruler gives 1.414. Could it be that √2 simply equals 1.414? That would be a smart move, because if you could prove it, you would have √2 = 1.414 = 1414/1000 -- a rational number indeed! Let us see.

So what sorcery is this entity called √2? Simply speaking, it’s the number whose square should be 2. So, we should expect the square of 1.414 to be 2. Alas! It turns out that 1.414² = 1.999396, a little short of 2, isn't it?

Never mind, you procure a better ruler with more precise scale markings and measure the diagonal side of the cloth (AC) to be 1.41421356237 m. But on squaring it, we get 1.41421356237² = 1.9999999999912458800169, again, short of 2.

The fact of the matter is that no matter how precisely you measure the value of √2, it’s inexpressible as a fraction. But how do I convince you of that? You should demand a proof. A proof that √2 is not a rational number.

Let’s see what we could do:

Assume √2 to be a rational number, and let’s give this assumption a name: "The Rational Root Assumption" (TRRA). If TRRA were true, we should be able to find two integers p and q such that √2 = p/q. In addition, let us demand that p and q meet a condition: that they have no common factors except 1. Let us call this the “no common factors” condition (NCFC).

Now, “√2 = p/q” simply means that p = q√2, or p² = 2q². As soon as you multiply something by 2, the product becomes an even number. So 2q² is an even number, and hence p² is an even number as well. This leads to our first conclusion: that p is an even number. (If p were odd, we could write p = 2k+1 for some integer k, and then p² = (2k+1)² = 4k²+4k+1 = 2(2k²+2k) + 1 would be odd too -- which is impossible, since we showed p² is even.) Let’s call it the “p is an even number” conclusion (PENC).

But what does PENC mean? That p could be written as 2m for some suitable integer m. Let’s substitute this into the equation p² = 2q². We get (2m)² = 2q², or 4m² = 2q², or q² = 2m². Oh, we have seen this before. This means q² is even, and hence q is even (for the reasons made clear above). Let us call this the “q is an even number” conclusion (QENC).

The summary of the foregoing discussion is this: [TRRA and NCFC] implies [PENC and QENC]. In other words, if √2 is a rational number with numerator p and denominator q, and p & q have no common factors, then both p and q are even numbers. Wow, isn't that hard to believe, because how could p and q be even and not have any common factors? If they are even, they would have 2 as a common factor. Now, this is what we call a contradiction! And since the logical flow was flawless, there is only one explanation to the contradiction: the TRRA assumption -- that √2 is rational. Hence, we have proved that √2 is irrational. Period!

Was this discussion easy to follow? Yes.

Was it easy to write? No, because we had to express the proof in full English sentences.

In fact, proofs are best expressed using shorthand symbols. To illustrate, the following would be a shorter version of the same argument:

To prove: √2 ∉ ℚ.

Proof: Assume √2 ∈ ℚ.

⇒ ∃ p, q ∈ ℤ with p⊥q and q ≠ 0 s.t. √2 = p/q.

⇒ p² = 2q² ⇒ 2|p² ⇒ 2|p ------------------> (1)

⇒ ∃ m ∈ ℤ s.t. p = 2m ⇒ (2m)² = 2q² ⇒ q² = 2m² ⇒ 2|q² ⇒ 2|q ------> (2)

Now from (1) and (2) above, we have 2|p and 2|q.

⇒ p⊥q is not true. Hence, we have a contradiction.

So, √2 ∉ ℚ. Hence, proved.

So √2, after all, is an irrational number and hence could not be written as a fraction of two integers.
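To make the contradiction concrete, here is a quick Python sketch (illustrative only -- a finite search is not a proof; the function name is ours). It looks exhaustively for a pair of integers p, q with p² = 2q² and, just as the argument predicts, finds none, while the best decimal guesses always fall short when squared:

```python
from fractions import Fraction

# If sqrt(2) were p/q for integers p, q, then p*p == 2*q*q exactly.
# Search every q up to a bound; p need only range over q..2q,
# since 1 < sqrt(2) < 2.
def find_rational_sqrt2(bound):
    for q in range(1, bound + 1):
        for p in range(q, 2 * q + 1):
            if p * p == 2 * q * q:
                return Fraction(p, q)
    return None

print(find_rational_sqrt2(1000))  # None -- no such pair exists
print(1.414 ** 2)                 # just under 2
print(1.41421356237 ** 2)         # closer, but still not exactly 2
```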

Impossible Probabilities

To find the probability of an event is to measure something, and the prerequisite for measurement is to define what to measure. Imagine what happens if what you want to measure is not well defined. When asked to compute the conversion ratio of a campaign, your first question should be: what is the definition of a conversion event? Let us understand the importance of defining concepts explicitly and clearly with the following example, known as one of Bertrand’s paradoxes, from the book Probability and Statistics by Vijay K. Rohatgi et al.

Question: A chord is drawn at random in the unit circle. What is the probability that the chord is longer than the side of the equilateral triangle inscribed in the circle? 

To understand the question more clearly, consider the circle as follows.


We have a circle (in red) centered at O with radius r = 1. Inscribe into it an equilateral triangle PQR (blue). If we now randomly draw a chord on this circle (call it chord AB), what is the probability that it is longer than the side (say s = PQ = QR = RP) of the triangle PQR?

Do you see any problem in the question formulation? If no, then you might be surprised to know that there are at least three solutions depending on how one defines the concept “a chord at random”.

Solution 1: Every chord on the circle could be uniquely defined by its end-points. Let us fix one of the end-points -- A -- on the circumference of the circle. This also defines a unique inscribed equilateral triangle APQ. The choice of the other end-point (B) dictates the length of the chord AB.

If B lies on the arc between A and P (Case 1 below), we get a chord shorter than the side of the triangle. Similar is the case when B is chosen on the circumference of the circle between A and Q (Case 2 below). But when we choose B to be somewhere on arc PQ (Case 3), we get a longer chord. 


Hence, the favourable positions for B (i.e., those making AB longer than the side) are the points on the circumference between P and Q (Case 3). Since points A, P, and Q divide the circumference of the circle into three equal arcs AP, PQ, and AQ, we have length(arc AP) = length(arc PQ) = length(arc AQ) = 2𝜋/3. Hence, we get the desired probability as length(arc PQ) / circumference = (2𝜋/3) / 2𝜋 = 1/3.

Solution 2: Another way in which the length of a random chord is uniquely determined is by the distance of the chord’s midpoint from the circle’s centre O. If we fix a radius OC, the side of the inscribed equilateral triangle PQR cuts OC at its midpoint S, so length(OS) = length(SC) = length(OC) / 2 = 0.5. Our problem can be solved by picking a point X on OC and drawing the chord AB through X, perpendicular to OC.


Now, where X is picked decides how long the chord is. If X is picked on segment SC, we get a shorter chord; picking it on segment OS gives a longer one. So our favourable region for X is segment OS. In other words, the desired probability is length(OS) / length(OC) = 0.5 / 1 = 1/2.

In conclusion, the same question has two solutions -- 1/3 and 1/2 -- based on our interpretation of the concept of a “random chord”. If you refer to the book, there is a third construction that gives a probability of 1/4. This shows how important the exercise of “defining” a concept can be.
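If the two different answers feel suspicious, a quick Monte Carlo simulation reproduces both (a rough sketch; the function names are ours). A chord drawn from two uniformly random endpoints beats the triangle’s side about a third of the time, while a chord whose midpoint sits at a uniformly random distance from the centre beats it about half the time:

```python
import math
import random

random.seed(42)
N = 200_000
SIDE = math.sqrt(3)  # side of an equilateral triangle inscribed in a unit circle

# Interpretation 1 (Solution 1): pick the chord's two endpoints
# uniformly at random on the circumference.
def chord_by_endpoints():
    a = random.uniform(0, 2 * math.pi)
    b = random.uniform(0, 2 * math.pi)
    # distance between (cos a, sin a) and (cos b, sin b)
    return 2 * abs(math.sin((a - b) / 2))

# Interpretation 2 (Solution 2): pick the chord's midpoint at a
# uniformly random distance d from the centre along a fixed radius.
def chord_by_midpoint():
    d = random.uniform(0, 1)
    return 2 * math.sqrt(1 - d * d)

p1 = sum(chord_by_endpoints() > SIDE for _ in range(N)) / N
p2 = sum(chord_by_midpoint() > SIDE for _ in range(N)) / N
print(p1, p2)  # roughly 1/3 and 1/2
```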
At Factors, we support the philosophy of crunching numbers (rather than intuition) to provide intelligent marketing insights, which are only a click away for you to experience: click here to schedule a demo with us. To read more such articles, visit our blog, follow us on LinkedIn, or read more about us.

Intuition can only take us so far: Fun with Factors (Part 1)

Analytics
January 25, 2021
0 min read

“Trust your intuition; it never lies.” It’s a saying most of us have heard and might strongly agree with. But at Factors this week, things were quite different when we had a session on “Intuition can only take us so far”. The idea was to relook at known concepts -- concepts we use more often than not -- and reimagine their implications from different perspectives. This article is an account of the one-hour discussion. We associate the word “factors” with different concepts at different times. Here, we associate it with maths!

Mathematics: Sturdy yet fragile

We started with the following story from “How Mathematicians Think” by William Byers:

A mathematician is flying non-stop from Edmonton to Frankfurt with Air Transat. The scheduled flying time is nine hours. Sometime after taking off, the pilot announces that one engine had to be turned off due to mechanical failure: "Don't worry -- we're safe. The only noticeable effect this will have for us is that our total flying time will be ten hours instead of nine." A few hours into the flight, the pilot informs the passengers that another engine had to be turned off due to mechanical failure: "But don't worry -- we're still safe. Only our flying time will go up to twelve hours." Sometime later, a third engine fails and has to be turned off. But the pilot reassures the passengers: "Don't worry -- even with one engine, we're still perfectly safe. It just means that it will take sixteen hours total for this plane to arrive in Frankfurt." The mathematician remarks to his fellow passengers: "If the last engine breaks down, too, then we'll be in the air for twenty-four hours altogether!"

Well, from basic math knowledge, you might find the next number in the sequence 9, 10, 12, 16 to be 24. Here’s how you find it. The first four numbers could be broken down as follows:

9 = 9

10 = 9+2⁰

12 = 9+2⁰+2¹

16 = 9+2⁰+2¹+2²

Pretty clearly, the next number in the sequence has to be 9+2⁰+2¹+2²+2³ = 24.
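The pattern above can be checked in a line of Python:

```python
# Each engine failure adds the next power of two to the 9-hour base:
# 9, 9+1, 9+1+2, 9+1+2+4, 9+1+2+4+8
terms = [9 + sum(2 ** k for k in range(i)) for i in range(5)]
print(terms)  # [9, 10, 12, 16, 24]
```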

But does that mean the plane will stay in the air for 24 hours? No. It has only four engines, and if the last one breaks down too, the pilots would either perform an emergency landing or, in the unfortunate case, the plane would crash. This shows both the strength and the fragility of maths. While in the first four cases we could accurately predict how long the journey would take, as soon as the conditions change (i.e., gliding through the air instead of being propelled by engines), the dynamics of motion change too.

Intuition could misdirect

Following is an example that the “professor of professors”, Prof. Vittal Rao, had given in one of his talks: Imagine you have some identical coins you are supposed to distribute among some identical people. How would you do that? Or more mathematically: In how many different ways P(n) can you distribute n identical coins to any number of identical people? Let us understand the problem by taking cases:

n = 1

  • The only way to do that is to give it to a single person:  o.  Hence, P(1) = 1.

n = 2

Distribute 2 coins. Here are two different ways:

  • You either give both coins to one person:  oo
  • Or you take two people and hand them a coin each:  o|o

Hence, P(2) = 2.

n = 3

Distribute 3 coins. What do you think P(3) should be? If P(1) = 1, P(2) = 2, we could expect P(3) to be 3, right? Let’s see.

  • ooo
  • oo|o
  • o|o|o

And 3 it is! Hence, P(3) = 3.

n = 4

Now this drives our intuition even further. The sequence we have seen until now has been 1, 2, 3. So it’s natural to assume P(4) to be 4. Let us enumerate all cases again.

  • oooo
  • ooo|o
  • oo|oo
  • oo|o|o
  • o|o|o|o

We have 5 ways to distribute 4 coins -- this beats our intuition. We get P(4) = 5.

n = 5

With new information in hand (i.e., the sequence being 1, 2, 3, 5), we could update our intuition, say this matches the Fibonacci sequence, and expect it to follow 1, 2, 3, 5, 8, 13, ... Let’s see what happens with 5 coins in hand:

  • ooooo
  • oooo|o
  • ooo|oo
  • ooo|o|o
  • oo|oo|o
  • oo|o|o|o
  • o|o|o|o|o

We get P(5) = 7 (not 8 as we had expected).

n = 6

Now what? We could now turn to a different logic: They are either odd numbers (barring the extra ‘2’) following 1, 2, 3, 5, 7, 9, 11, …,  or prime numbers (barring the extra ‘1’) following 1, 2, 3, 5, 7, 11, 13, ..., giving P(6) to be either 9 or 11 respectively. Taking n = 6, we have:

  • oooooo
  • ooooo|o
  • oooo|oo
  • ooo|ooo
  • oooo|o|o
  • ooo|oo|o
  • oo|oo|oo
  • ooo|o|o|o
  • oo|oo|o|o
  • oo|o|o|o|o
  • o|o|o|o|o|o

That’s 11 ways! The prime-number logic worked.

n = 7

Going by the same logic, we would expect P(7) to be 13 (the next prime number). But if you go on and enumerate the cases, you will find that P(7) is, in fact, equal to 15 (please go ahead and enumerate them).

In fact, it turns out that the sequence P(n) expands as follows: 1, 2, 3, 5, 7, 11, 15, 22, 30, 42, 56, 77, 101, 135, 176, 231, 297, 385, 490, etc. You could take a moment and think about it intuitively, but chances are rare that you would come up with the following formula:

P(n) ≈ (1 / (4n√3)) · e^(π√(2n/3)), for large n.

The foregoing formula was derived by the renowned mathematician Srinivasa Ramanujan (along with G. H. Hardy). This illustrates that intuition can take us only so close to the solution, and formal maths may have to be invoked in some cases.
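Rather than enumerating partitions by hand, P(n) can be computed with a short dynamic program (a standard counting recurrence; `partitions` is our own helper name), which reproduces the sequence above:

```python
# P(n): number of ways to split n identical coins among identical people,
# i.e. the integer partition function. Standard dynamic program:
# admit one part size k at a time, so ways[n] accumulates ways[n - k].
def partitions(limit):
    ways = [1] + [0] * limit  # ways[0] = 1: the empty partition
    for k in range(1, limit + 1):      # allow parts of size k
        for n in range(k, limit + 1):
            ways[n] += ways[n - k]
    return ways

print(partitions(12)[1:])  # [1, 2, 3, 5, 7, 11, 15, 22, 30, 42, 56, 77]
```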

At Factors, we support the philosophy of crunching numbers (rather than intuition) to provide intelligent marketing insights, which are only a demo away for you to experience. To read more such articles, visit our blog, follow us on LinkedIn, or read more about us.

Find the next article in this series here.

What's next in Big Data and Analytics? (Part 2)

Analytics
August 12, 2020
0 min read

In the previous blog, we very briefly went over the history of big data technologies. We saw how, with the rise of the internet, databases evolved from relational databases to NoSQL databases like Bigtable, Cassandra, and DynamoDB, along with the development of technologies like GFS and MapReduce for distributed file storage and computation. These technologies were first developed by companies like Google and Amazon, and later picked up in a big way by the open source community.


Big Data and Enterprises

Soon enough, commercial versions of these open source technologies were being distributed by companies like Cloudera and Hortonworks, and traditional enterprises started adopting them for their analytics and reporting needs.

Prior to this, enterprises built data warehouses, which were essentially large relational databases. Building one involved combining data from the databases of multiple systems (ERP, CRM, etc.) into a unified and relatively denormalized database. Designing the data warehouse was complex and required careful thought, and data was updated only periodically. Each update involved a three-stage process: extracting data from the various sources, combining and transforming it into the denormalized format, and loading it into the data warehouse. This came to be known as ETL (Extract, Transform, Load).

With the adoption of Hadoop, enterprises could now just periodically dump all their data into a cluster of machines and run ad-hoc MapReduce jobs to pull out any report of interest. Visualization tools like Tableau, PowerBI, and Qlik could connect directly to this ecosystem, making it seamless to plot graphs from a simple interface while the crunching of large volumes of data happened in the background.

Customer Centric View of Data

Databases are a final system of record, and analytics on databases only gives information on the current state of customers, not how they got there. With the rise of the internet, many businesses are now online or have multiple digital touchpoints with customers, so it is now easier to instrument and collect customer data as a series of actions, be it clickstream or online transactions. This customer-centric model of data enables richer analytics and insights. Additionally, the data is incremental and can be made available immediately in reports, instead of being updated only periodically. More enterprises are moving to this model, and datastores and technologies that cater specifically to these kinds of use cases, like TimescaleDB, Druid, and Snowplow, are being actively developed.

So what’s next?

To summarize, the bulk of the big data revolution of the last 15 years has been about building systems capable of storing and querying large amounts of data. The queries are raw: if X and Y are variables in the data and x1 and y2 are two corresponding values of interest, then the system can return all data points where variable X matches x1 and Y matches y2, or some post-processed result over all the matching data points. Along the way, we also built systems that can compute on large amounts of data in a distributed fashion.

So what’s next in analytics from here? Is it building machine learning models? Certainly, the availability of all this data enables organizations to build predictive models for specific use cases. In fact, the recent surge of interest in machine learning has largely come from the better results we get by running old ML algorithms at larger scale in a distributed way. But while most ML techniques can be used to build offline models that power predictive features, they are not useful in the context of online or interactive analytics. Most techniques are designed for high-dimensional unstructured data like language or images, where the challenge is not only to build models that fit well on seen data points, but also generalize well to hitherto unseen ones.

Datastores that make sense of data

The next logical step would be datastores and systems that can make sense of data. Making sense of data means that instead of blindly pulling out data points where variable X is x1 and Y is y2, the system should also be able to interactively answer a different class of queries, like:

  • Give the best value for variable Y,  that maximizes the chance that X is x1.
  • Find all the variables or combination of variables, that influence X most when X is x1.

Such a system would continuously build a complete statistical or probabilistic model as data gets added or updated. The models would be descriptive and queryable, and the time taken to answer the different classes of queries should be tractable. But just as there are a host of databases, each tuned differently for:

  • Data Model
  • Scale
  • Read and Write Latencies
  • Transaction guarantees
  • Consistency, etc

We could possibly have different systems here tuned for

  • Assumptions on Data Model
  • Accuracy
  • Ability to Generalize
  • Scale of the data
  • Size of the models
  • Time taken to evaluate different types of queries.

Autometa is one such system -- the first of its kind -- that we are building at factors.ai. It continuously makes sense of customer data to reduce the work involved in inferring from it. Drop us a mail at hello@factors.ai to know more or to give it a try.

Big Data and Analytics - What's next? (Part 1)

Analytics
August 6, 2020
0 min read

Apache Hadoop, Hive, MapReduce, TensorFlow -- these and a lot of similar terms come to mind when someone says Big Data and Analytics. It can mean a lot of things, but in this blog we will restrict it to the context of analytics done on relatively structured data, collected by enterprises to improve the product or business.

When I started my career as an engineer at Google around a decade back, I was introduced to MapReduce, Bigtable, and the like in my very first week. These technologies were completely unheard of outside and seemed accessible and useful to only a select few in big companies. Yet, within a few years, there were small shops and training institutes springing up to teach Big Data and Hadoop, even in the most inaccessible lanes of Bangalore.

It’s important to understand how these technologies evolved -- or rather, exploded -- before we consider the next logical step.

Dawn of time

Since the dawn of time (or rather, the Unix timestamp), the world was ruled by relational databases, something most engineers are familiar with. Data is divided (or normalized) into logical structures called tables. These tables are not completely independent; they are related to each other using foreign keys -- data entries that are common across tables.

Take the example of data from a retail store. The database could have three tables: one for the Products the store sells, one for its Customers, and one for Orders of products bought in the store. Each entity can have multiple attributes, stored in different columns of the corresponding table, and each data point is stored as a row. The Orders table contains entries of products bought by different customers, and is hence related to both the Products and Customers tables via the columns product_id and customer_id.


A few implications of this structure are:

  • Since each data unit is split across tables, most updates would involve updating multiple tables at once. Hence transaction guarantees are important here, wherein you either update all the tables or none at all.
  • Data can be fetched almost any way you want. For example, we can fetch all orders bought by a specific customer or all customers who bought a specific product. Additional indices can be defined on columns to speed up retrieval. But since data is split across tables, it sometimes could involve costly joins when matching the related items across tables.
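As a sketch, the retail-store schema above can be reproduced with SQLite from Python (the table and column names here are illustrative, not from a real system). The join at the end shows how the foreign keys product_id and customer_id let us fetch all orders of a specific customer:

```python
import sqlite3

# Three-table retail schema: Products, Customers, and Orders,
# related via foreign keys.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE products  (product_id INTEGER PRIMARY KEY, name TEXT, price REAL);
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders    (order_id    INTEGER PRIMARY KEY,
                        product_id  INTEGER REFERENCES products(product_id),
                        customer_id INTEGER REFERENCES customers(customer_id));
INSERT INTO products  VALUES (1, 'Headphones', 99.0), (2, 'Keyboard', 49.0);
INSERT INTO customers VALUES (10, 'Asha'), (11, 'Ravi');
INSERT INTO orders    VALUES (100, 1, 10), (101, 2, 10), (102, 1, 11);
""")

# Fetch all products bought by one customer: a join matches related
# rows across tables through the foreign-key columns.
rows = db.execute("""
    SELECT c.name, p.name
    FROM orders o
    JOIN customers c ON c.customer_id = o.customer_id
    JOIN products  p ON p.product_id  = o.product_id
    WHERE c.name = 'Asha'
    ORDER BY p.name
""").fetchall()
print(rows)  # [('Asha', 'Headphones'), ('Asha', 'Keyboard')]
```

Note how the query touches three tables at once -- exactly the kind of join that becomes costly as tables grow.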

SQL (Structured Query Language) became the de facto standard to query these databases and thus SQL databases also became the namesake for relational databases. These served the needs of all enterprises. As the data grew, people moved to bigger and better database servers.

Rise of Internet

Then, in the 90s, came the internet. One of the limitations of a SQL database is that it needs to reside on one machine to provide its transactional guarantees and to maintain relationships. Companies like Google and Amazon that were operating at internet scale realized that SQL could no longer scale to their needs. Further, their data models did not need to maintain complex relationships.

If you store and retrieve each data unit as a whole, rather than in parts across tables, then each data unit is self-contained and independent of the others. The data can then be distributed across different machines, since there are no relationships to maintain between them.

Google, for instance, wanted to store and retrieve information about a webpage only by its URL, and Amazon product information by product_id. Google published a paper on Bigtable in 2006 and Amazon on Dynamo in 2007, describing their in-house distributed databases. While Dynamo stored data as key-value pairs, Bigtable stored data by dividing it into rows and columns. Lookups can be done by row key in both databases, but in Bigtable only the data in the same column family is co-located and can be accessed together. Given a list of rows and columns of interest, only those machines which hold the data are queried and scanned.


Now you no longer needed bigger and better machines to scale. The mantra changed from bigger and super machines to cheap, commodity hardware with excellent software. And since the hardware was assumed to be unreliable, the same data had to be replicated and served from multiple machines to avoid data loss.

Open source projects soon followed suit. Based on different trade-offs in read and write latencies, assumptions in the data model, and flexibility when retrieving data, we now have a plethora of distributed databases to choose from: HBase, MongoDB, and Cassandra, to name a few. Since these databases were not relational (or SQL), they came to be known as NoSQL databases.

Related Big Data Technologies

This fundamental change in databases also came with auxiliary changes in how data was stored and used for computation. Most data is stored in files. But now, these files should be accessible from any of the machines, they could grow to be very large, and they should not be lost when a machine goes down.

Google solved this by breaking files into chunks of roughly equal size and distributing and replicating these chunks across machines, while keeping them accessible within a single namespace. A paper on this distributed file system, called GFS, was published way back in 2003. Bigtable was in fact built on top of GFS.

Distributed databases allowed you to access data in only one way (or a couple of ways): using keys. It was not possible to access data based on the values present inside the data units. In SQL, you can create an index on any column and access data based on the values in it. Take the example of Google storing web pages: you could access information about a webpage using the URL cnn.com (row key), or get the links in a given webpage using the row key (cnn.com) and a column key (links). But how do you get the URLs of web pages that contain, say, the words “Captain Marvel”?

So if the data needed to be accessed in a different way, it had to be transformed, such that data units related to each other by the values they hold come together. The technology used to do that was MapReduce. It has two phases: first, Mapper processes load and scan chunks of the data on different machines, emitting matching entries -- here, the URLs of pages that contain the words “Captain Marvel”. These are then sent to Reducer processes, which collect and output all the matched URLs. More complex data transformations and joins across different sources usually require pipelines of MapReduces. The framework was generic enough to perform all sorts of distributed computation tasks and became the de facto standard for distributed computing. The paper on MapReduce was published by Google in 2004.
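Here is a toy, single-process sketch of that idea in Python (the corpus and function names are made up for illustration; it searches for the single word “Captain” rather than the full phrase). The map phase emits (word, url) pairs, a shuffle groups them by word, and the reduce phase collects the URLs for the word of interest:

```python
from collections import defaultdict

# Toy corpus: url -> page text (hypothetical data).
pages = {
    "cnn.com": "Captain Marvel review ...",
    "bbc.com": "weather update ...",
    "imdb.com": "Captain Marvel cast ...",
}

# Map phase: each mapper scans its chunk and emits (word, url) pairs.
def mapper(chunk):
    for url, text in chunk:
        for word in set(text.split()):
            yield word, url

# Shuffle: group the emitted pairs by key, so that data units related
# by the values they hold come together.
groups = defaultdict(list)
for word, url in mapper(pages.items()):
    groups[word].append(url)

# Reduce phase: collect all urls for the word of interest.
def reducer(word, urls):
    return word, sorted(urls)

print(reducer("Captain", groups["Captain"]))
# ('Captain', ['cnn.com', 'imdb.com'])
```

In a real MapReduce, the mappers run on many machines over file chunks and the shuffle moves data across the network, but the three stages are the same.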


Yahoo soon took the cue, developing and open-sourcing these technologies as what we all know as Hadoop, later adopted by Apache. Now, if MapReduce can be used to transform data, it can also be used to retrieve data that matches a query. Technologies like Apache Hive, Dremel, and BigQuery were developed, which allow users to fire SQL queries on large amounts of structured data, with the results actually delivered by running MapReduces in the background. An alternative to loading data into a different machine and then computing on top of it is to take the computation closer to where the data resides; frameworks like Apache Spark were developed broadly on this philosophy.

In the next blog, we will look at some current trends in these technologies and discuss how we think they will evolve.

FactorsAI + Segment: Easy and instant analytics to drive growth

Product
August 6, 2020
0 min read

We are excited to announce our integration with Segment, further enabling companies to easily instrument user interactions across platforms and push different types of customer data from any third-party source to FactorsAI in real time.

FactorsAI + Segment Integration

FactorsAI provides advanced and intuitive analytics for marketers and product managers, to help drive growth. With FactorsAI you get immediate insights to optimize marketing campaigns, improve conversions and understand user behaviours that drive feature adoption and retention.

A good analytics setup requires detailed tracking of user actions -- page views, signups, added-to-cart events, and so on -- with their different attributes. The quality of the user-behaviour insights FactorsAI surfaces depends on this level of detail. With the Segment integration, this is a one-time setup, and you can send the same events to your other tools for marketing automation, CRM, etc.

Further, with Segment you can pull in data from sources like email and live chat, which send events such as Email Delivered, Email Clicked, and Live Chat Started. These additional events are useful when analyzing user conversions, and with Segment this can be done without writing custom code against our APIs.

Segment can perform all data collection for FactorsAI: it captures all the data FactorsAI needs and sends it directly to FactorsAI in the right format, all in real time. So if you are on Segment, you can start getting insights on how to grow your customer base in no time.

To integrate with Segment, follow the steps here. Happy Analyzing!

See Factors in action

Schedule a personalized demo or get started for free

Let's chat! When's a good time?