Welcome to insideBIGDATA’s “Heard on the Street” round-up column! In this regular feature, we highlight thought-leadership commentaries from members of the big data ecosystem. Each edition covers the trends of the day with compelling perspectives that can provide important insights to give you a competitive advantage in the marketplace. We invite submissions with a focus on our favored technology topic areas: big data, data science, machine learning, AI and deep learning. Enjoy!
Morph mountains of data into value with a product-centric strategy. Commentary by Traci Gusher, EY Americas Data and Analytics Leader
According to EY’s 2022 Tech Horizon Survey, over half (52%) of U.S. C-suite and senior business leaders said their biggest technology investment will be in data and analytics over the next two years. This is not surprising, as today’s business leaders realize they have access to internal and external data that is growing exponentially. When managed correctly, data can provide meaningful insights, growth opportunities and ultimately tangible value. All too often, however, organizations are disappointed with the value generated from their data, with fault attributed to poor quality, data siloed in disparate applications and insufficient ownership. To derive maximum insights and benefit from data, companies should think about it like a product. This requires support and investment from the C-suite and senior executive leaders whose entire remit is driving value from data. Just like any other product management process, it is imperative to align product managers from the business who deeply understand data, understand who needs it, why they need it, and how they will realize value from it in the long term, and who hold ultimate responsibility for a data product roadmap. Equally important are product owners focused on delivering on the product roadmap to enable the intended value. Key to this approach is a clear partnership across the business and technology teams with a shared product mindset. Let’s put this ‘data as a product’ concept into perspective. For example, if a CMO wants to understand how to segment customers more effectively, traditionally they need to request or pull data from several disparate places to get to the needed external, financial, ecommerce and CRM data. Instead of pulling from all those sources ad hoc, a customer data product could be developed and managed so that data sets needed for common customer-related analytics are easier to access and utilize for insights.
With these types of data products, the CMO can more efficiently access high-quality, varied data that is regularly updated with new, relevant information and can be trusted for segmentation, predictive modeling, recommender systems and even external commercialization of data.
Small data. Commentary by Satya Samudrala, Data Scientist, Digitate
With all the focus on Big Data, we’ve forgotten that many valuable observations about an organization are often quite small, in the range of megabytes or even mere kilobytes. This “small data” is often defined as data that is small enough to be processed by a single machine or understandable by a single individual. We see small data around us every day. It could be the contacts list in our phone, our calendar, or our monthly bank statement. Even in enterprise IT systems, small data is quite prevalent. Small data is particularly valuable for empowering AI. The lack of sufficient data has been the single biggest challenge in industry’s adoption of AI to date. Many organizations need AI solutions for targeted problems where information is limited (for example, predicting rare phenomena or anomalous behavior). In these cases, small data provides answers that Big Data doesn’t. When properly analyzed, small data will prove to be a goldmine that significantly expands how businesses can use AI.
Infrastructure-as-code is becoming obsolete amid low code/no-code technologies. Commentary by Venkat Thiruvengadam, Founder and CEO, DuploCloud
When it was first introduced, infrastructure-as-code (IaC) led to a fundamental shift in the way software engineers and Ops thought about the provisioning and maintenance of infrastructure. While IaC has gained wider adoption among DevOps teams, the complexities of data center configuration and management continue to create problems. Small changes in IaC can ripple through thousands of lines of code and it is harder to maintain IaC than it is to write the first time. In my opinion, IaC is rapidly becoming a last-decade technique and new emerging technologies can help solve and streamline many of the issues IaC presents. Developers want to focus on applications – not infrastructure. That is why technologies such as low code/no code and code automation are the future. They provide developer self-service with guardrails. After all, the best way to build and manage a secure infrastructure with agility is to add a layer of orchestration on top and use that to provision and change infrastructure resources. This layer needs to understand the high-level intent and required policies.
Use AI to remove data’s muddling middleman: You. Commentary by Christian Lawaetz Halvorsen, CTO & Co-founder of Valuer
In our personal lives, finding the answers to the most mundane or complex questions is usually just a couple of taps away. We live in an age where information is both a blessing and a curse: its immeasurable value is often negated by its sheer abundance, most of which is worthless. As a result, intelligent solutions are required to process data and provide valuable insights. Yet in the professional world, where the stakes are much higher, the scope, speed and accuracy of unreliable data risk limiting and confusing stakeholders, so that traditional experience-based decision making takes precedence. One of the best ways to overcome such a hurdle is to automate the processing of data. Popular search engines have long cemented themselves as foundations of our personal lives for a reason: with automated decision engines at their core, they guide users to make objective and validated choices. Echoing this notion, Nik Storonsky, Revolut’s CEO, recently announced plans to create an AI-driven VC fund that, once created, will help avoid the pitfalls of traditional data analysis, such as following the crowd, experience and personal bias. By using AI to process data, Storonsky’s direction reflects what for many represents the inevitable future of data, where all professional spheres will have AI-driven decision engines supporting and guiding their fast-paced industries.
Perspectives on the metrics store. Commentary by Luke Han, CEO, Kyligence
Many enterprises are on a journey to enable self-service analytics to empower all of their users to use data and analytics for organizational benefits. An increasing number of enterprises use technologies such as cloud data lakes and cloud data warehouses to boost their digital transformation. However, technical professionals struggle with consolidating business definitions in one place to provide a single source of truth that is trustworthy, understandable, discoverable, and cost-effective. One solution to these challenges is the metrics store. A metrics store is a middle layer between upstream data warehouses/data sources and downstream business applications. It decouples metric definitions from BI reporting and data warehouses. The teams who own the metrics can define them once in the metrics store, forming that single source of truth, and consistently reuse them across BI, automation tools, business workflows, or even advanced analytics.
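To make the "define once, reuse everywhere" idea concrete, here is a minimal sketch of a metrics store in Python. It is an illustration only, not how any particular product works: the `Metric` and `MetricsStore` classes, the `average_order_value` metric and the sample order rows are all hypothetical names and data introduced for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, float]


@dataclass(frozen=True)
class Metric:
    """A single business metric, defined once by the team that owns it."""
    name: str
    description: str
    compute: Callable[[List[Row]], float]


class MetricsStore:
    """Hypothetical middle layer between data sources and consumers."""

    def __init__(self) -> None:
        self._metrics: Dict[str, Metric] = {}

    def define(self, metric: Metric) -> None:
        # Refuse duplicate definitions so there is one source of truth.
        if metric.name in self._metrics:
            raise ValueError(f"metric {metric.name!r} is already defined")
        self._metrics[metric.name] = metric

    def query(self, name: str, rows: List[Row]) -> float:
        # Every consumer goes through the same definition.
        return self._metrics[name].compute(rows)


store = MetricsStore()
store.define(Metric(
    name="average_order_value",
    description="Total revenue divided by number of orders",
    compute=lambda rows: sum(r["revenue"] for r in rows) / len(rows),
))

# Illustrative rows standing in for data from an upstream warehouse.
orders = [{"revenue": 120.0}, {"revenue": 80.0}, {"revenue": 100.0}]

# A dashboard and an automated workflow reuse the same definition.
dashboard_value = store.query("average_order_value", orders)
workflow_value = store.query("average_order_value", orders)
```

Because both consumers call `query` rather than re-implementing the formula, a change to the definition propagates to every downstream tool at once.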
How AI and NLP will make keyword searches a thing of the past for financial research. Commentary by Sebastian Okser, CTO, Cyndx
When it comes to research, financial investors often rely on traditional business brokers, search engines like Google, or legacy datasets. However, a typical Google search is not designed to surface details like a company’s financials or the competitive landscape of a specific company. Further, Google certainly isn’t built for filtering and ranking by such details. In addition, legacy financial datasets tend to be very limited in terms of the total number of companies that they cover, the depth of information associated with any one company, and their ability to serve up meaningful insights or associations between companies. AI and NLP are changing this. AI now enables us to frame large language models that read and comprehend website content, academic articles, social media and more to understand how information is connected. Combined with machine learning, we can create ensembles – not just something off the shelf – that build up profiles, surface often previously overlooked opportunities and rank them in a way that is relevant and delivers completely new concepts. In essence, AI and NLP are giving us the tools to dig deeper. Synthesizing this data makes it far more meaningful than any keyword search ever could. This matters because as more people rely on these technologies, an increasing number of startups will be more efficiently connected with the right capital providers, helping to fuel VC investment and get more cutting-edge ideas off the ground. And if you ask me, this approach won’t stop at the private markets. The valuable insights surfaced by the application of AI and NLP will reach far beyond finance, and in my opinion, will change the way we search forever.
Actionable steps to creating a holistic approach to data management. Commentary by Adi Paz, CEO, GigaSpaces
Embracing new digital technologies to increase technological agility doesn’t have to be as daunting a task as it seems. The key to successful digital transformation is marrying current strategies with a modern data platform that can quickly and securely make information available where and when it is needed. There is no need to start from scratch and reinvent the wheel. A few principles to keep in mind while modernizing your approach to data management are: 1) Adopt an iterative approach. Factor in industry and workload attributes when integrating new and existing environments while applying a gradual and adaptive approach. 2) Assess your portfolio and build your roadmap. Starting with talent, take stock of the existing team and resources to determine any gaps. If a mainframe is part of your architecture, consider how it can fit into a hybrid cloud ecosystem. 3) Leverage multiple modernization strategies. Be brave and try many strategies. Whether it is adopting new APIs, developing cloud-native applications, trying containerized applications or the latest DevOps techniques – IT teams won’t know which combination best fits their needs until they try.
Perspectives on Cloudera. Commentary by Andrew Brust, Bluebadge Insights
For a while now, I have felt that Cloudera has been too modest, too deferential and too averse to confrontation with competitors. An overall theme is that the competitors, and especially the cloud providers, offer data and analytics solutions that rely on a complex collection of fragmented services, leaving it to the customer to sort out their integration. Cloudera offers a true platform, covering the full data lifecycle, with components that work together cooperatively. CDP offers depth, breadth and, most important, integration between CDP components that the cloud providers can’t hope to match. In terms of unified management and control, SDX focuses on this, with great success. Azure and Google Cloud, meanwhile, try to address this with Arc and Anthos, technologies that are not comprehensive, working with only certain data services, with others to come. The cloud hyperscalers are providing jalopies (vehicles of mismatched components, slapped together in haphazard fashion), while Cloudera is offering an elegant, well-engineered vehicle, which we could liken to a luxury sedan, a sportscar or both. It would be great to have a brainstorming session where we worked through some messaging on this, so that Cloudera could flip its competitors and pin them to the mat. Complexity is the enemy. Cloudera offers orchestrated, elegant, turnkey integration of data technologies, across the data lifecycle, that combats and extinguishes that complexity. We have to tell that story and make it clear that the competition is failing where Cloudera is prevailing.
Natural Language Processing will play a huge role in operational efficiency moving forward. Commentary by Teresa O’Neill, Director of Natural Language Solutions at iMerit
The use cases for Natural Language Processing (NLP) transcend industries. Whether NLP is being used to extract insights from meeting recordings and voice calls in the conversational intelligence space, distill information in medical documents, such as patient records, in the healthcare sector, or identify harmful speech behavior in the gaming industry, the technology is enhancing operational efficiency and improving critical processes across the board. There’s a common misconception that NLP is poised to steal human workers’ jobs, a growing concern especially amid the current economic climate. However, that’s not the case: NLP is much more complementary to humans than people realize; it enables human workers to focus on more fulfilling and nuanced aspects of their jobs, while automating the mundane. In the next five years, we can expect NLP to be integrated into the fabric of every company’s daily operations.
Data Abstraction Is the Key to Multi-Cloud Adoption. Commentary by Adit Madan, Director of Product Management at Alluxio
Most of our customers currently use or plan to use more than one cloud service provider (CSP). Despite striving for an ideal state of being cloud-agnostic, most organizations have not accomplished this because of the heterogeneous services provided by each cloud. Data abstraction is key to application portability and cloud agnosticism. Abstraction of storage allows migration of applications from one cloud to another without moving the data itself, especially across geographic regions and ownership domains. With storage abstraction, organizations can decouple data management from elastic compute resources. Reducing the time spent on the unique traits of each cloud vendor and avoiding data copies across clouds are key to the agility needed to future-proof data platforms, with the flexibility to select the most suitable cloud services.
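As a rough illustration of storage abstraction, the Python sketch below shows an application written against a single interface while the data stays in different backends. This is a simplified sketch under stated assumptions and not the API of any real product: `ObjectStore`, `InMemoryStore`, `UnifiedNamespace`, `count_lines` and the mount prefixes are hypothetical names, and the in-memory backends merely stand in for storage services at two different cloud providers.

```python
from abc import ABC, abstractmethod
from typing import Dict, Tuple


class ObjectStore(ABC):
    """The only storage interface the application code depends on."""

    @abstractmethod
    def read(self, key: str) -> bytes: ...

    @abstractmethod
    def write(self, key: str, data: bytes) -> None: ...


class InMemoryStore(ObjectStore):
    """Stand-in for one provider's storage service (assumption)."""

    def __init__(self, provider: str) -> None:
        self.provider = provider
        self._objects: Dict[str, bytes] = {}

    def read(self, key: str) -> bytes:
        return self._objects[key]

    def write(self, key: str, data: bytes) -> None:
        self._objects[key] = data


class UnifiedNamespace(ObjectStore):
    """Routes logical paths to whichever backend is mounted at a prefix."""

    def __init__(self) -> None:
        self._mounts: Dict[str, ObjectStore] = {}

    def mount(self, prefix: str, backend: ObjectStore) -> None:
        self._mounts[prefix] = backend

    def _resolve(self, key: str) -> Tuple[ObjectStore, str]:
        # Longest matching prefix wins.
        for prefix in sorted(self._mounts, key=len, reverse=True):
            if key.startswith(prefix):
                return self._mounts[prefix], key[len(prefix):]
        raise KeyError(f"no backend mounted for {key!r}")

    def read(self, key: str) -> bytes:
        backend, relative = self._resolve(key)
        return backend.read(relative)

    def write(self, key: str, data: bytes) -> None:
        backend, relative = self._resolve(key)
        backend.write(relative, data)


def count_lines(store: ObjectStore, key: str) -> int:
    """Application logic: knows nothing about which cloud holds the data."""
    return len(store.read(key).decode("utf-8").splitlines())


cloud_a = InMemoryStore("cloud-a")
cloud_b = InMemoryStore("cloud-b")

namespace = UnifiedNamespace()
namespace.mount("/sales/", cloud_a)
namespace.mount("/logs/", cloud_b)

namespace.write("/sales/q1.csv", b"id,amount\n1,10\n2,20\n")
namespace.write("/logs/app.log", b"start\nstop\n")

sales_lines = count_lines(namespace, "/sales/q1.csv")
log_lines = count_lines(namespace, "/logs/app.log")
```

In this sketch, moving the application to a different cloud means running `count_lines` elsewhere against the same namespace; remounting a prefix to another backend changes where the data lives without touching the application code.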
2H 2022 AI predictions. Commentary by Mike Paul, VP of Sales at Sleek Technologies
The second half of 2022 is once again set to be headlined by supply chain issues as key Chinese ports slowly come back online following another round of health restrictions. With that, many shippers and logistics companies are continuing to double down on their technology investments, particularly when it comes to AI, in the hope of becoming more agile and responsive. Whether it is finding alternate sources at the drop of a hat or surveying the capacity market in real time, shippers, manufacturers and logistics companies are beginning to apply AI in an increasing number of ways. Therefore, as the supply chain industry continues to fight through various woes in the second half of 2022, look for it to become more sophisticated in its use of advanced computing technology.
Proceed into the enterprise metaverse with caution. Commentary by Ramprakash Ramamoorthy, Director of Research at ManageEngine
The metaverse is the latest fad within Big Tech’s surveillance economy, and it’s bringing a host of problems related to privacy and security as organizations adopt enterprise metaverse concepts. Organizations leveraging enterprise metaverses are expanding cyber-attack surfaces significantly because within the metaverse ecosystem, there are IoT devices and wearables from multiple vendors, as well as sensors throughout offices and homes actively processing a colossal amount of user behavior data in real time. Plus, the companies with an enterprise metaverse are using AR/VR devices that collect a ton of personally identifiable information (PII), including financial and personal data. Even more problematic, to verify users, these businesses will want to collect biometrics, including fingerprints and facial recognition. Not only does this create more data to protect from bad actors, but it also allows the company access to more employee data than ever before. Further, enterprise metaverses will also be particularly ripe for social engineering attacks. It’s one thing to receive a fake request from a work colleague via email, and quite another to look at a colleague’s avatar face and process that same request. Avatars will be falsified, stolen, and weaponized more broadly by bad actors to commit fraud. On top of this, the prevalence of cryptocurrency transactions in the metaverse will make it easier to hide ill-gotten gains.
Real-world business applications for high-quality data. Commentary by Tomas Kratky, Co-founder and CEO of MANTA
Data quality is an age-old problem for enterprises of all sizes. With enormous swaths of data flowing between numerous complex systems, it’s easy for data quality issues to go unnoticed. However, the longer low-quality data is allowed to flow through systems without intervention, the more likely it is to cause problems such as bad business intelligence, delayed cloud migrations, inaccurate consumer insights, and more. Data integration, migration, supply chain management, marketing initiatives, and regulatory compliance are just a few business use cases for high-quality data. For example, when businesses come together through a merger or acquisition, the integration process suffers if either organization has poor-quality data. As a result, data becomes more challenging to manage, and companies may be liable for failing to meet compliance standards. Separately, ensuring high-quality data when migrating away from legacy systems is critical in preventing issues like poor business intelligence and process disruptions. While data quality has historically been a pain point for businesses, it doesn’t have to be. Taking advantage of lineage tools that offer pipeline visibility enables organizations to identify where quality issues might arise and address them before they become problematic.
International Women in Engineering Day. Commentary by Shreshtha Mundra, Senior Software Engineer, Cohesity
My advice to young women in engineering, today and always, is to not second-guess yourself: just go for it. Research shows that many women do not apply for positions they’re qualified to hold, with a LinkedIn study finding that women apply to a staggering 20 percent fewer jobs than men overall. What’s more, an internal report by Hewlett Packard found that men will apply for a job when they meet about 60 percent of qualifications, while women tend to apply only if they meet almost all qualifications. Of course, that’s only one part of the story. Both in the industry and in academia, we too often see a skewed pipeline of talent that overlooks highly qualified, capable women engineers. It’s beyond time for organizations to examine their hiring processes and work culture to ensure they’re creating opportunities for women in tech. Establishing Diversity, Equity and Inclusion councils, as well as employee resource groups, can go a long way toward opening new doors while also addressing the systemic factors behind existing biases.
International Women in Engineering Day. Commentary by Nathalie Tikwa, Engineering Manager, Nitro
The most significant things I’ve learned in my career are these. First, learning how to work with a diverse set of people takes openness and patience. Second, for each project or task, find the best way to communicate within the team to get things done and with the stakeholders to achieve strategic alignment. Finally, be aware of your team structure and ensure communication flows naturally – leaving minimum space for interpretation gaps. My advice for someone looking to build a career in engineering is to jump straight into it. There are very few topics within the field for which I have not been able to find high-quality, open-source learning material and courses. It is an unlimited playing field, and nothing stops you from getting started on the topics you wish to explore.
International Women in Engineering Day. Commentary by Kiara Oliver, Software Engineer, Tamr
As a software engineer, I have a passion for technology and problem solving. Although I fix bugs, do code updates and troubleshoot daily, I hope to learn more about design so I am able to create my own code from the ground up. I really want to be able to improve the efficiency of my code.
International Women in Engineering Day. Commentary by Marie Donovan, Quality Engineering Lead, Tamr
My advice for women trying to break into engineering is to find something that truly interests you first. Then, look for a place that will make space for you to do that work but also gives you the flexibility to move between roles and try new things. Eventually, you may find something new that you want to try, and having a place that supports your interests and mobility will be key to achieving that.