Big Data Quarterly Summer 2021 Issue
NEXT-GENERATION DATABASES PROVIDE an EMBARRASSMENT of RICHES for DATA MANAGERS

25 INSIGHTS
features

4 THE VOICE OF BIG DATA
Taking Data and Analytics to the Next Level in 2021: Q&A With Radiant Advisors' John O'Brien

Enabling Data Intelligence: Q&A With Zaloni's Ben Sharma

8 FEATURE ARTICLE | Joe McKendrick
Next-Generation Databases Provide an Embarrassment of Riches for Data Managers

24 BIG DATA BY THE NUMBERS
Key Trends in Data Management

columns

27 DATA SCIENCE PLAYBOOK | Jim Scott
Log Parsing With AI: Faster and With Greater Accuracy

28 DATA DIRECTIONS | Michael Corey & Don Sullivan
The Coming Tsunami of Automation

30 THE DATA-ENABLED ORGANIZATION | Lindy Ryan
Enabling the Entire Organization

31 THE IoT INSIDER | Bart Schouw

ADVERTISING
Stephen Faig, Business Development Manager, 908-795-3702; [email protected]

INFORMATION TODAY, INC. EXECUTIVE MANAGEMENT
Thomas H. Hogan, President and CEO
Thomas Hogan Jr., Vice President, Marketing and Business Development
Roger R. Bilboul, Chairman of the Board
Bill Spence, Vice President, Information Technology
Mike Flaherty, CFO

BIG DATA QUARTERLY (ISSN: 2376-7383) is published quarterly (Spring, Summer, Fall, and Winter) by Unisphere Media, a division of Information Today, Inc.

POSTMASTER
Send all address changes to: Big Data Quarterly, 143 Old Marlton Pike, Medford, NJ 08055

Copyright 2021, Information Today, Inc. All rights reserved.
PRINTED IN THE UNITED STATES OF AMERICA

Big Data Quarterly is a resource for IT managers and professionals providing information on the enterprise and technology issues surrounding the "big data" phenomenon and the need to better manage and extract value from large quantities of structured, unstructured, and semi-structured data. Big Data Quarterly provides in-depth articles on the expanding range of NewSQL, NoSQL, Hadoop, and private/public/hybrid cloud technologies, as well as new capabilities for traditional data management systems. Articles cover business- and technology-related topics, including business intelligence and advanced analytics, data security and governance, data integration, data quality and master data management, social media analytics, and data warehousing.

No part of this magazine may be reproduced by any means—print, electronic, or any other—without written permission of the publisher.

COPYRIGHT INFORMATION
Authorization to photocopy items for internal or personal use, or the internal or personal use of specific clients, is granted by Information Today, Inc., provided that the base fee of US $2.00 per page is paid directly to Copyright Clearance Center (CCC), 222 Rosewood Drive, Danvers, MA 01923, phone 978-750-8400, fax 978-750-4744, USA. For those organizations that have been granted a photocopy license by CCC, a separate system of payment has been arranged.

Photocopies for academic use: Persons desiring to make academic course packs with articles from this journal should contact the Copyright Clearance Center to request authorization through CCC's Academic Permissions Service (APS), subject to the conditions thereof. Same CCC address as above. Be sure to reference APS.

Creation of derivative works, such as informative abstracts, unless agreed to in writing by the copyright owner, is forbidden.

Acceptance of advertisement does not imply an endorsement by Big Data Quarterly. Big Data Quarterly disclaims responsibility for the statements, either of fact or opinion, advanced by the contributors and/or authors.

The views in this publication are those of the authors and do not necessarily reflect the views of Information Today, Inc. (ITI) or the editors.
As restrictions caused by the pandemic begin to ease and a "new normal" for business comes into view, some organizations are emerging as winners. Key to thriving and not just surviving, many experts say, is the ability to use data more effectively, enabled by the well-applied use of the right database technologies and cloud resources as well as automation and AI. The implications of these transformative technologies are examined from a range of perspectives in a variety of articles in this issue of Big Data Quarterly.

In his cover article exploring current database management trends, contributing editor Joe McKendrick notes that if there is one characteristic that singularly dominates the data scene now, it is the vast assortment of systems available. Today, he points out, there is a database for every type of function, creating choice and opportunity but also requiring knowledge about the best database environment for the use case.

DataOps, MLOps, robotic process automation, and low-code/no-code development are some of the new approaches taking hold, according to Radiant Advisors' John O'Brien, who shares his take on the top technology trends helping organizations gain value from their data. There is a new appreciation for being agile in terms of BI and data analytics since, over the course of the past 15 months, many companies have discovered that what they were doing was simply not fast enough, O'Brien notes. DataOps is building on the best practices that have emerged from the DevOps world and is relying heavily on automation and data governance, adds Zaloni's Ben Sharma in a separate interview. "You must be able to use data effectively in a timely manner so that you can adapt to change," Sharma stresses.

Adding to the discussion of where AI and automation can add value, NVIDIA's Jim Scott shares how natural language processing methods are now being used to automate the task of parsing network security logs.

A number of authors in this issue also focus on the human considerations involved in new technology adoption. LicenseFortress' Michael Corey and VMware's Don Sullivan weigh in on the impact of what they call the coming "tsunami of automation." As companies increasingly look to automation to address the intertwined concerns of labor availability, profitability, efficiency, and speed, there is almost nothing outside of the arts and sports that will not be automated, they suggest. Also looking at automation and innovation from the human perspective, Software AG's Bart Schouw relates his own experience as a consumer purchasing a high-end "smart" oven and the problem of achieving the critical last mile of customer service. Connected products necessitate connected CX across all channels, he concludes. Meanwhile, Veeam's Danny Allan considers what's necessary to achieve the productivity that DevOps promises and observes that cultural shifts are the hardest kinds of reorganizations to pull off.

And there are many other articles that examine the use of new approaches and best practices to improve data analytics, security, and governance. Radiant Advisors' Lindy Ryan spotlights the challenges of data-enabling the entire organization, SAS' Kimberly Nevala looks at AI governance, and EPAM Systems' Sam Rehman underscores the importance of embedding security by design to defend the enterprise against cyberthreats.

To stay on top of the latest data trends, research, and news, visit www.dbta.com/bdq, register for highly informative weekly webinars at www.dbta.com/Webinars, and take advantage of the extensive white paper library available at www.dbta.com/DBTA-Downloads/WhitePapers.
Dell Technologies has announced plans to spin off its 81% equity ownership interest in VMware, resulting in two standalone companies. The transaction is expected to close during the fourth quarter of calendar 2021. www.delltechnologies.com and www.vmware.com

Kyndryl will be the name of the new, independent company that will be created following the separation of IBM's Managed Infrastructure Services business, which is expected to occur by the end of 2021. Kyndryl will be headquartered in New York City. www.kyndryl.com and www.ibm.com

Alluxio, a developer of open source cloud data orchestration software, has introduced a go-to-market solution in collaboration with Intel to offer an in-memory acceleration layer with 3rd Gen Intel Xeon Scalable processors and Intel Optane persistent memory (PMem) 200 series. www.alluxio.io and www.intel.com

MarkLogic, a provider of cloud data integration and data management software, has announced the general availability of a custom connector for AWS Glue, a fully managed, serverless data integration service, to create, run, and monitor data integration pipelines. www.marklogic.com

Redis Labs has closed $110 million in financing led by a new investor, Tiger Global, bringing the company's valuation to more than $2 billion. The company's Series G round also included participation from another new investor, SoftBank Vision Fund 2, and existing Redis Labs investor TCV. https://redislabs.com

Matillion, an enterprise cloud data integration platform, has introduced Matillion ETL for Delta Lake on Databricks, enabling data professionals across the business to aggregate and share data in a single environment. www.matillion.com and https://databricks.com

Denodo, a leader in data virtualization, is releasing Denodo Standard, a new data integration solution available on cloud marketplaces that leverages Denodo's modern data virtualization engine to deliver superior performance and productivity. www.denodo.com

Sumo Logic has partnered with AWS for the launch of Amazon CloudWatch Metric Streams, a fully managed, scalable, and low-latency service that streams Amazon CloudWatch metrics to partners via Amazon Kinesis Data Firehose. AWS and Sumo Logic customers now have a fully managed solution for streaming CloudWatch metrics into Sumo Logic to help simplify the monitoring and troubleshooting of AWS infrastructure, services, and applications. www.sumologic.com and https://aws.amazon.com

Cloudera, the enterprise data cloud company, announced the Cloudera Data Platform is now available on Google Cloud, allowing customers to get positive business results fast with instant access to quality data on a scalable, open source, enterprise data cloud platform. www.cloudera.com and https://cloud.google.com

DataStax has introduced Astra Serverless, an open, multi-cloud serverless DBaaS that delivers a combination of pay-as-you-go data together with the freedom and agility of multi-cloud and open source technology. DataStax Astra builds upon the Apache Cassandra open source database and introduces a modern, microservices-based architecture that separates compute from storage, enabling database resources to scale up and down on demand to match application requirements and traffic, independent of compute resources. www.datastax.com

Aerospike, provider of next-generation, real-time NoSQL data solutions, has released the Aerospike Kubernetes Operator and made advancements in Aerospike Cloud Managed Service to help enterprises unlock cloud productivity and agility with scale-out cloud data. www.aerospike.com

HVR, an independent provider of real-time cloud data replication technology, is expanding its partnership with Snowflake, enabling customers to utilize HVR within the Snowflake Data Cloud through Snowflake's Partner Connect. www.hvr-software.com and www.snowflake.com

MathWorks has introduced the newest release of the MATLAB and Simulink product families. Release 2021a (R2021a) offers hundreds of new and updated features and functions in MATLAB and Simulink, along with three new products and 12 major updates. www.mathworks.com

Elastic and Grafana Labs are forming a partnership to deliver the best possible experience of both Elasticsearch and Grafana. Through joint development of the official Grafana Elasticsearch plugin, users can combine the benefits of Grafana's visualization platform with the full capabilities of Elasticsearch. www.elastic.co and https://grafana.com
How do you feel about low code, and is this approach becoming more relevant for analytics?
I'm a big fan of the low-code and no-code world. We've been helping companies move to this kind of paradigm where what you have are people that are the knowledge workers, the subject-matter experts—the business SMEs—and when you give them a no-code or low-code kind of tool to work with data, they are going to be 10 times more efficient in finding, exploring, and validating data than a data engineer or data scientist. They have the business knowledge, which means that when they look at something, they can say, "That's not right" or "That's bad data." It is business data enablement, and we've got a whole practice dedicated to this.

Is this tied to the metamorphosis companies went through in 2020?
It's the next incarnation of agile BI that we've seen for a couple of decades. It's just that now, companies are trying to be data-driven, which means that they are enabling everybody in the business to work with the data. In the technologies, the low-code world is one of the biggest trends moving forward. And then the second one is cloud. Very simply, [it is] the ability for people to self-provision their own needs and resources; that's the other piece where you're taking time out of the cycle.

There is also a strong emphasis on blending what were previously siloed processes. How are methodologies such as DataOps being used now?
A significant aspect of DataOps around continuous integration and continuous deployment is that the team not only has to do the engineering work and the testing, but now they are responsible for embedding governance. We have actually leaned more on DataOps to embed data governance inside of every data pipeline with monitoring and alerting notifications.
We created an organizational strategy of enabling teams with the goal of making these agile delivery teams better at deploying their own code. If, organizationally, you have this team in an enablement, supportive role, they will make sure that those agile delivery teams have everything they need to do a good job, which is embed governance—embed data quality, auditing, and proper security. The challenge that we have heard from some clients in the field further down this journey is that the data engineer who likes to write and integrate code now becomes a full-stack engineer, which means that person needs to understand security and all these other parts.

Has it been effective?
Some teams have struggled a little bit because it is a shift from a data engineer to becoming a full-stack engineer. In some cases, these teams are deploying things into containers and deploying Kubernetes into the fabric, and they are saying that it is just too much technology and too many layers. In the past, there was a security layer, a data layer, and a governance layer, and you could be in just one space. Now that we're slicing vertically, data engineers have to sign up to do all of that.

How quickly is DataOps being embraced?
I do believe the key is having what we call IT enablement organizations that are focused on how agile delivery teams are doing. These teams need best practices, good processes, and a methodology more than they need the technologies. The old way of thinking is, "I'll buy a technology; it'll solve my problem." It doesn't work that way. The number-one challenge in these companies is, "How do I change culture? How do I change to democratize the data workforce and enable self-service?" That is what they need to focus on, and that's where the challenge lies.

Are there any other approaches on the horizon that you think are going to be helpful?
What I think will be interesting in the future is related more to machine learning. We have been talking about operationalizing analytics for over 5 years. It has finally come together to become MLOps in order to scale. It is an inverted paradigm shift from BI. In BI, we build a dashboard and we're done. In ML, you put it into production and your work is beginning because you have to start monitoring all the time.
MLOps will be one of the big trends that I think will continue to evolve. One of my favorite new kinds of technologies in the space is automated feature engineering. We get a lot of questions about whether all of the data from 2020 is really garbage for a training dataset because it was so unrealistic. Training datasets are a big challenge. There is new technology that allows you to take AI and give it a 1,000-column dataset, and it will churn through that until it gets down to the dozen or so that are the most relevant for you, and the AutoML piece allows you to look at the high predictors.

Is there anything else?
The other major trend for 2021 is going to be robotic process automation (RPA). We expect RPA to come in and make a big hit on the self-service data analytics world. Those are probably the two main categories that will dominate over the next year or two.

What's next?
We are at a turning point right now as companies are going from surviving to thriving. That's how we view it. There is a lot of pent-up demand to figure out how to do it right, which is nice. In years past, everybody wanted to just go buy something.

Interview conducted, edited, and condensed by Joyce Wells.
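The automated feature engineering described in the interview, in which an AI churns through a 1,000-column dataset until only the dozen or so most relevant columns remain, can be sketched with off-the-shelf tooling. This is an illustrative approximation, not the specific product being referred to; it assumes scikit-learn is available and uses a random forest's feature importances as the ranking signal.

```python
# Hypothetical sketch of automated feature selection: start with a very
# wide table, let a model rank the columns, keep only the top dozen.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for a wide business dataset: 1,000 columns,
# only a handful of which carry real signal.
X, y = make_classification(n_samples=500, n_features=1000,
                           n_informative=10, random_state=0)

# Rank every column by how heavily the forest relies on it...
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
ranked = np.argsort(model.feature_importances_)[::-1]

# ...then keep only the dozen highest predictors.
top_12 = ranked[:12]
X_reduced = X[:, top_12]
print(X_reduced.shape)  # (500, 12)
```

A full AutoML system layers model search and validation on top of a ranking like this; the sketch shows only the column-reduction step.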
A few years ago, if you had asked a group of C-Level executives to project which software delivery trend would be more important for organizations, most would have ranked DevOps ahead of containers. DevOps promised to profoundly refocus and re-energize software teams while containers seemed to be an interesting new way to repackage resources that were already there.

Today, the rankings have flipped. Container adoption is accelerating—and the technology that orchestrates container usage, Kubernetes, is regularly being described as "revolutionary." DevOps? While it's still popular—up to three-quarters of all organizations use a DevOps blueprint—more users say they're struggling with integrations. One survey found that 86% of organizations consider software delivery a top priority, but only 10% say they're successful at it. A dozen years after the term was first coined, DevOps may be heading for Gartner's so-called "Trough of Disillusionment."

The good news is that DevOps, similar to many technology concepts, will likely climb out of the trough and head toward the fourth and fifth stages of Gartner's "Hype Cycle"—a "Slope of Enlightenment," leading to a "Plateau of Productivity." What organizations need to do is be patient and understand that DevOps—unlike a move to containers—is a journey that takes time and will, by nature, have its ups and downs.

Kubernetes is, at its core, a game-changing technology. Organizations use container platforms to create and run applications a whole new way—and Kubernetes orchestrates the way container infrastructures operate. Kubernetes opens up a new approach for delivering services—providing scalability, portability, and better economics. It allows development teams to move faster, work more cost-efficiently, and move applications on any cloud or on-premises platform. These are tangible technical and financial benefits that won't diminish over time.

Kubernetes is following the same trajectory virtualization did back in the 2000s. Virtualization essentially solved many of the storage and resource constraints organizations were experiencing using physical servers. Now, containers present a lightweight alternative to virtual machines, and Kubernetes provides a framework for managing containers at scale.

Challenging Operational Complexity

DevOps is more complicated—and more of a trigger for frustration. It's focused on restructuring teams and processes. It attempts to solve a process challenge of operational complexity and miscommunication. Rather than have developers and operations teams working in silos on different schedules with different priorities, DevOps attempts to bring them together as one cohesive unit.

The problem is, DevOps isn't a cookie-cutter approach that works the same way with every organization. Some entities are able to use DevOps to sort out inefficiencies and get intransigent workers aligned on a common goal. Others run into trouble getting workers motivated or managing the change process. While some organizations are able to implement DevOps quickly, others may find it can take time. It all depends on the culture of the organization.

DevOps aims to reduce complexity—but for some organizations, it can have the reverse effect. Teams have to get used to new sets of rules, timetables, reporting structures, and general working conditions. Workers may need to be retrained or reassigned, which introduces more complexity into an already challenging process of delivering software.

Although DevOps introduces some technical changes, it mainly involves making a cultural shift. Cultural shifts are the hardest kinds of reorganizations to pull off. They require buy-in from workers at all levels—and they usually take time to execute correctly.

DevOps also depends heavily on automation. Essentially, the goal is to automate as many tasks as possible to ensure that the overall system runs like a clock—moving builds from step to step, performing regular tests, catching flaws, and capturing data. It's a lofty goal, but in practice, it's hard to do. Even the

Danny Allan is CTO at Veeam (www.veeam.com).
BY JOE McKENDRICK
The move to next-generation databases is driven by their ability to help companies achieve competitiveness and reach customers faster and more efficiently. These new breeds of systems can be a force for business transformation—whether it is generating new sources of revenue, enhancing customer experience, or producing data-driven insights that improve how organizations interact with customers.

"Advances in web technology, social networking, mobile devices, and Internet of Things have resulted in the sudden explosion of structured, semi-structured, and unstructured data generated by global-scope applications," according to a published analysis by Ali Davoudian of Carleton University. Due to their inflexibility and the cost and complexity of data transformation and migration, traditional RDBMSs cannot meet many of today's digital requirements alone, Davoudian and his colleagues stated. "Such applications have a variety of requirements from database systems, including horizontal scalability to linearly adapt to the massive amounts of data and the increasing rate of query processing by making use of additional resources, high availability and fault tolerance to respond to client requests, even in the case of hardware or software failure or upgrade events, transaction reliability to support strongly consistent data, and database schema maintainability to reduce the cost of schema evolution."

NoSQL systems, for example, "are used not as a revolutionary replacement for the relational database systems but as a remedy for certain types of distributed applications involved with a massive amount of data that need to be highly scalable and available," said Davoudian.

LEADING THE NEXT-GENERATION

It isn't just the NoSQL systems that are gaining attention—open source databases have also increasingly been implemented as solutions. "We're seeing a huge growth in technologies like Apache Cassandra and Apache HBase," said Ken LaPorte, manager of the data infrastructure engineering
DATA STRATEGIES for the REAL-TIME ERA

Qlik | PAGE 17
DATA ONBOARDING: OVERCOMING THE CHALLENGE

Swim | PAGE 18
CONTINUOUS INTELLIGENCE: APPS THAT STAY IN SYNC WITH THE REAL WORLD

Semarchy | PAGE 19
THE KEY ROLE OF THE DATA HUB FOR REAL-TIME DATA STRATEGIES

CData | PAGE 20
LOGICAL DATA ARCHITECTURES FOR REAL-TIME INTEGRATION

GigaSpaces | PAGE 21
DIGITAL INTEGRATION HUB: THE ARCHITECTURE OF DIGITAL TRANSFORMATION

DataStax | PAGE 22
THE FUTURE OF DATA MANAGEMENT IN THE CLOUD

continues on page 16
architectures, AI, and the growing use of analytics. Every single middle-market asset management, insurance, and financial institution surveyed by BDO in 2020 said they have developed—or are planning to develop—a digital strategy. But despite the fact that literally everyone is working on a digital strategy, only one-quarter (27%) of those institutions are executing their strategies.

There are four areas where data-layer technologies in particular can help traditional banking and financial services firms overcome the new challenges and profit from emerging opportunities.

1. Customers increasingly demand an omnichannel experience from their financial services providers. Traditional banks that successfully implement such a strategy can turn their physical branches into a competitive advantage, see improved recommendation rates, and encourage customers to take on more products and services.

2. Regulations are requiring financial institutions to share customer data through open banking processes. But meeting these standards is not just a cost. In the UK, which was an early adopter of open banking, the measures have been shown to unlock new revenue opportunities.

3. Financial institutions face a growing threat from fraud and cybercrime. Data layer technologies can help financial services companies meet these challenges, giving customers confidence that their financial security is in good hands.

4. Availability and scalability are vital to ensuring that new and innovative services can actually be delivered to customers, providing banks with the flexibility to meet changing conditions and build customer trust.

In banking and finance, companies that want to stay relevant must manage and use data in ways that benefit their customers, enable agile business processes, and support new products and services.

Redis Enterprise brings real-time performance to use cases like identity verification, transaction scoring, and more. Redis Enterprise can also bring the power of in-memory processing to other components of a fraud detection system. The RedisGraph module enables fast graph processing that can be used to detect synthetic fraud, and RedisAI brings real-time AI model-serving to power more efficient transaction analysis. Furthermore, Redis Enterprise Cloud and tiered storage options in Redis Enterprise offer an attractive TCO by eliminating data center-related spending and improving IT productivity to let your organization focus on rapid innovation, rather than just keeping the lights on. Finally, Redis Enterprise provides enterprise-grade reliability, performance, and high availability for mission-critical financial applications. It ensures five-nines (99.999%) availability around the world with active-active geo-distribution across regions, and provides an in-memory data layer that delivers sub-millisecond latency at virtually any scale.

To learn more about how Redis helps financial services companies address the increasing demands of modern finance, download "Data Innovation Opportunities in Banking and Finance."

Delivering Real-Time Retail

During the last decade, widespread access to fast, reliable broadband and the evolution of innovative online services combined to create the era of real-time retail. Consumer expectations of retailers were transformed by their experiences buying from Amazon and other ecommerce companies that were able to provide fast, efficient shopping experiences. In 2020, the trend was amplified even further. COVID-19 lockdowns restricted shopping at brick-and-mortar stores in many countries, sending even more consumers online.

Today, retailers cannot afford to offer mediocre experiences. Nine out of 10 U.S. consumers say they will abandon a retailer's website if it is too slow, according to a 2020 study compiled by Retail Systems Research for Yottaa. Worse, almost six out of 10 (57%) will visit a competitor's site instead and one in five (21%) would never return. If that's not bad enough, one in seven (14%) would vent their frustration on social media.

But faster performance for mobile apps and websites is only the beginning. Retailers can make the user experience feel more responsive by showing products in stock, allowing shoppers to search purchase history, and supporting buying online and picking up curbside.

A retailer's omnichannel and supply chain systems must also be able to scale up when required to meet increased demand around predictable, major events of the retailer's year such as Black Friday and Cyber Monday, as well as special events such as the release of limited-edition items. Additionally, these systems must be able to scale up to meet consumers' expectations even during unpredictable surges in demand such as those caused by surprise endorsements from online influencers or by unexpected external events.

Supporting Instant Retail Experiences

Without a world-class data layer, a retailer will struggle to develop a genuinely effective and compelling real-time offer. The data layer underpins key elements within a successful real-time retail proposition. It must provide a consistent real-time view of inventory, managing updates from stores and enterprise systems to give customers and staff a clear, accurate view of stock availability. It also needs to be resilient and scalable, to satisfy consumers' expectations, and to manage periods of increased demand.

As an in-memory database delivering multiple data structures with best-in-class performance, Redis Enterprise is perfectly suited to meet the demands of real-time retail. It provides the performance needed to deliver a great shopping experience, ensuring that retail applications and websites are always fast and responsive, and supports real-time inventory. Redis Enterprise can scale up capacity and performance in response to demand from real-time applications, with no need to change application code and without incurring additional costs, downtime, or disruption. By providing automated failure detection, failover, and cluster recovery, Redis Enterprise helps ensure that retailers can continue operating even after experiencing bursts of traffic during seasonal peaks or unexpected surges in demand.

To learn more about how Redis supports real-time retail, download "Retail in the Era of Real-Time Everything."

Enabling Real-Time Inventory Systems

Real-time systems allow large, multi-site retailers to optimize inventory, yield management, and supply-chain management. Relying on historical data makes inventory forecasting less accurate, increasing costs from carrying excess inventory and requiring unnecessary shipping.

Retailers can also face reduced yields due to poor execution of enterprise-wide pricing and promotional strategies—for example, the inability to allocate available inventory to the highest-margin locations. And real-time inventory is an essential component of a unified national order-fulfillment strategy, letting retailers pool geographically clustered store locations and warehouses to contribute to a single inventory.

Retailers without real-time inventory management risk product unavailability in the face of natural events and disasters. Before an anticipated event that may disrupt operations, real-time inventory management lets companies redirect fulfillment to healthy regions or proactively stock potentially impacted areas. More importantly, a store database must remain available even if it becomes cut off from the enterprise. This allows the retailer to continue operating with assurance that all of its inventory will automatically sync with the enterprise database—without any conflicts—once connections are re-established. Real-time inventory is a critical piece of an omnichannel

Leveraging Redis Enterprise's active-active database replication, with conflict-free resolution, allows enterprises to avoid the complexity and costs of managing message brokers between their in-store and enterprise databases while at the same time ensuring consistency. This eliminates the need for auditing, reconciliation, and risk of data duplication. In the event that a store becomes disconnected from the enterprise database, Redis Enterprise will automatically reconcile once the database becomes available.

Enterprise operations can also leverage the replica database to get an exact view into each store—allowing them to make inventory/yield/supply-chain management decisions based on real-time information and send updates to the stores as needed. This simplifies ship-from-store functionality and change management for store-based order fulfillment, and it helps ensure compliance with corporate promotions, pricing, inventory levels, and so on. Finally, it improves yield management by controlling local discounting behaviors.

Just as importantly, since Redis Enterprise is a multi-model database, it allows developers to choose the data structure best suited to the SLAs and data access patterns of their application. This is one of the many reasons why adoption of Redis Enterprise is growing within microservices architectures. For example, you can choose to deploy your database as a key-value store, a graph database, a time-series database, a cache, a streaming engine, a search engine, and/or a document store—and many others—with each database deployed on the same multi-tenant Redis Enterprise cluster, minimizing the complexity and costs of technology and vendor sprawl.

To learn more about how Redis supports real-time inventory management, download "Real-Time Inventory: Building Competitive Advantage."

The Redis Enterprise Real-Time Advantage

For any company, particularly those in the most high-pressure market sectors undergoing rapid digital transformation, Redis Enterprise offers the high availability, superior performance, and flexibility developers need to deliver real-time applications that meet the expectations of today's customers. Using active-active geo-distributed technology, Redis Enterprise allows Redis databases to be replicated across multiple geographic regions,
retail strategy, delivering a unified, seamless, and consistent delivering local latencies, rapid automated failover, and data
customer experience across all channels, including in-store consistency for globally distributed applications. Redis Enterprise
shopping, websites, mobile apps, email, and social media. is also available as a managed service via all three major cloud
providers, enabling developers to reduce time to market by
Optimizing Inventory, Yield, and Supply Chain Logistics quickly launching databases in the cloud.
For complex real-time inventory systems, Redis Enterprise Get started today with a free trial at https://redislabs.com/
is chosen by leading retailers because it uniquely provides the try-free.
capabilities required from a mission-critical database. Large
retailers like Staples, Gap, and many others are already reaping Redis Labs
these benefits. www.redislabs.com
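The conflict-free store-to-enterprise synchronization described above is the kind of behavior provided by conflict-free replicated data types (CRDTs), the technique underlying Redis Enterprise's active-active replication. As a rough pure-Python illustration of the idea (a minimal PN-counter sketch, not Redis Enterprise internals), two replicas can update stock independently while disconnected and still converge when they reconcile:

```python
# Minimal PN-counter CRDT sketch: each replica tracks per-replica
# increment and decrement totals, so concurrent updates merge
# deterministically by taking element-wise maximums.

class PNCounter:
    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.incs = {}  # replica_id -> total increments observed
        self.decs = {}  # replica_id -> total decrements observed

    def add(self, n=1):
        self.incs[self.replica_id] = self.incs.get(self.replica_id, 0) + n

    def remove(self, n=1):
        self.decs[self.replica_id] = self.decs.get(self.replica_id, 0) + n

    def value(self):
        return sum(self.incs.values()) - sum(self.decs.values())

    def merge(self, other):
        # Element-wise max is commutative, associative, and idempotent,
        # so replicas converge regardless of merge order or repetition.
        for rid, n in other.incs.items():
            self.incs[rid] = max(self.incs.get(rid, 0), n)
        for rid, n in other.decs.items():
            self.decs[rid] = max(self.decs.get(rid, 0), n)

# A store and the enterprise database update stock while disconnected...
store = PNCounter("store-42")
hq = PNCounter("hq")
store.remove(3)   # three units sold in-store
hq.add(10)        # replenishment recorded centrally
# ...then reconcile in either order with the same result.
store.merge(hq)
hq.merge(store)
print(store.value(), hq.value())  # 7 7
```

Because merging is idempotent, re-running reconciliation after a flaky connection causes no double-counting—which is why no auditing pass is needed after a store reconnects.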
Data Onboarding:
Overcoming the Challenge
Adam Mayer, Senior Technical Product Marketing Manager, Qlik
AS A BUSINESS, YOUR DATA GOALS initially center around understanding the data you need and how that data will be used. Once you have defined your data sources and your policies and procedures around data, your attention should move to driving efficiency. Data efficiency is an end goal, but it is important to realize that its execution takes many forms. Environmental adaptability, speed of replication, ease of use, and data protection are all factors that play into the efficiency of your data processes.

While batch-oriented data architectures like ETL/ELT with data warehousing remain popular, the need for real-time data is increasingly critical. From analytics and decision support to AI and ML, to data and process integration—real-time data drives efficiency, and often without the compliance and governance challenges inherent with data warehousing.

At the same time, the shift to cloud and hybrid-cloud infrastructure has led to growing challenges with data fragmentation. As organizations shift their infrastructure to cloud technologies, data has become more decentralized and more challenging to leverage as an asset. While APIs provide an extensibility point for accessing data, every integration is unique, making it challenging to extract actionable insights from disparate systems.

LOGICAL DATA WAREHOUSING
Data virtualization (DV) technologies offer a contemporary approach to both real-time integration and fragmentation. Data virtualization is a method of building a "logical" data access layer, or logical data warehouse, that provides a unified data layer for BI systems or enterprise applications to query. Instead of consolidating data to a single repository, data remains at the source and is accessed on-demand in real-time.

At CData, our connectivity solutions simplify real-time data integration to the point where it is accessible to any user. Our driver technologies and connectivity solutions create a logical data layer for data operations that makes all your data sources look and behave exactly like a standard database to applications.

ELIMINATE DATA SILOS
Our data virtualization technologies create a logical data connectivity layer with real-time access to every data source that matters. We enable users to integrate with 250+ SaaS, NoSQL, and Big Data sources through universally accessible plug-n-play interfaces that easily extend modern and legacy applications. This means you don't need developer resources to connect your BI, analytics, or reporting applications directly to live data—since those applications already know how to connect to a database, they can use our drivers to work with any data in real-time.

As performance is critical in real-time integrations, all our connectivity solutions are hyper-optimized to make the fewest requests and return data as quickly as possible. They transparently support features such as bulk/batch integration and query folding. This query folding means that our drivers will intelligently push specific functionality down to the data source and let the data source process the request. Our solutions pass as much data processing as possible to the source to reduce the amount of client-side processing work, minimize the number of API requests, and reduce the size of returned data sets.

REAL-TIME DATA CONNECTIVITY IN FOCUS
As the volume and complexity of enterprise data continue to grow, it is crucial to look beyond data connectivity as a collection of point-to-point integrations. To harness data's strategic value, it needs to be available across a broad spectrum of use. But thinking strategically doesn't mean you have to make major investments in new technologies and replace what is working. At CData, we offer a tactical approach to data connectivity that supports broad applications across every facet of data management.

Whether you are looking for real-time connectivity for analytics, supporting enterprise IT and business units with integration, setting up a data warehousing system, developing an application, or building connectivity for just about anything else—we've worked with customers to support nearly every data connectivity need.

To learn more, visit us online at www.cdata.com.

CData Software
www.cdata.com
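The query-folding behavior described above—pushing predicates down so the source does the filtering and only matching rows cross the wire—can be sketched in a few lines. This is a generic illustration using Python's built-in sqlite3 as a stand-in data source (the table and queries are hypothetical, not CData's driver API):

```python
import sqlite3

# Stand-in "remote" source with a few orders.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, total REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(1, "east", 120.0), (2, "west", 80.0),
                  (3, "east", 45.0), (4, "north", 300.0)])

def fetch_unfolded(region):
    # Naive approach: pull every row across the wire, filter client-side.
    rows = conn.execute("SELECT id, region, total FROM orders").fetchall()
    return [r for r in rows if r[1] == region]

def fetch_folded(region):
    # Folded approach: the predicate is pushed into the query, so the
    # source filters and only matching rows are returned.
    return conn.execute(
        "SELECT id, region, total FROM orders WHERE region = ?",
        (region,)).fetchall()

assert fetch_unfolded("east") == fetch_folded("east")
print(fetch_folded("east"))  # [(1, 'east', 120.0), (3, 'east', 45.0)]
```

Both functions return the same rows; the difference is where the work happens and how much data moves—the point of letting a driver fold filters, aggregates, and limits into the source query.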
Everyone is talking about digital transformation and the new normal of working from home, which has inarguably brought cybersecurity threats to a new level. With the boundaries of the network perimeter all but disappearing, the risks of an attack are greater than ever. While remote workforces require considerable attention, it is equally important to remember that cyberattacks happen everywhere. In early January 2021, for example, one of the country's largest wireless carriers was hacked when employees in its physical retail locations were scammed by individuals who brazenly downloaded software onto a store computer. After using employee credentials to gain access to the company's customer relationship management system, a wide range of customer information was lifted, from PIN codes to credit card numbers.

Similar stories are becoming more commonplace, though hacking has been a major threat to enterprises and government agencies for decades. Few veterans of information technology will forget the notorious Mafiaboy hacks in 2000. With a series of distributed denial of service (DDoS) attacks, which bombard a site or application with so many requests that the server is unable to keep up, 15-year-old Michael Calce was able to shut down the websites of E*Trade, Dell, Amazon, CNN, and Yahoo. With everything heading toward digital, it's not surprising to see this activity escalating, particularly as companies rush to accelerate time-to-value for customers and make their services more scalable and accessible. At the same time, the enterprise threat landscape has become increasingly dynamic, expansive, and fluid—making it harder for traditional security models and controls to defend against exploits.

Embedding Security By Design
While traditional approaches such as ring fencing will still be necessary, they are not enough for today's enterprises. What's central to prevention is embedding security by design. In other words, security must be integrated into software development, cloud infrastructure, and business systems holistically—starting first with a data strategy. No longer viewed as a byproduct of business processing, data is a critical asset that enables decision making. Therefore, a data strategy must do far more than address storage. It should start with identifying enterprise data assets, then establishing a common set of goals and objectives to ensure how it is safely stored, provisioned, processed, and governed—all of which are core to a zero-trust approach.

The zero-trust network, also known as the zero-trust architecture, was a model created in 2010 by Forrester Research analyst John Kindervag, who recognized that as data grows, so do the security threats for organizations across the board. Since then, the National Institute of Standards and Technology (NIST) has developed a free cybersecurity framework that, similar to Kindervag's model, helps organizations not only develop a shared understanding of cybersecurity risks but also reduce them with custom measures. Created in 2014 with input from private-sector and government experts, the framework (ratified as a NIST responsibility in the Cybersecurity Enhancement Act of 2014) was used by 30% of U.S. organizations in 2015 and was projected by Gartner to rise to 50% by 2020.

To make the most of this cybersecurity framework, it is recommended that organizations take a number of steps: classify data and know their core assets, align it with their regulatory requirements, enforce the principle of least privilege, define and layer controls to verify each point, and define how to observe incidents.

Whether it is a customer, partner, or employee, having the ability to identify end users in a way that is consistently reliable is one of the most fundamental controls for protecting an organization. Particularly for a growing company that is adding new users, this can become increasingly difficult to manage when compounded with the fact that most modern systems involve identities from multiple sources with different protocols, federated attributes, and identity mappings.

A Cloud-Specific Strategy
Since most data lives in the cloud now, it's essential to have a cloud-specific data security strategy. This starts with data classification and takes into account all the elastic and agile access semantics. The next step is to take a careful look at an encryption strategy—at rest, in use, and in transit—and make sure it is understood how the keys are managed and refreshed. Last but not least, it is imperative to have a robust disaster recovery plan in place.

This cannot be overstated: The most effective cybersecurity strategy is one that is architected into an enterprise's digital ecosystem and includes proactive (offensive security) and reactive (defensive security) measures. Venturing into 2021 has been nothing short of perilous. But with a keen awareness of the threat landscape and a zero-trust architecture by design, organizations are less likely to become another statistic and far more likely to gain a competitive edge.

Sam Rehman is senior vice president and chief information security officer of EPAM Systems (www.epam.com).
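One of the framework steps named above, enforcing the principle of least privilege, can be shown as a minimal default-deny access check. This is a generic sketch with hypothetical roles and resources, not EPAM's implementation:

```python
# Each role is granted only the minimal set of (resource, action)
# pairs it needs; anything not explicitly granted is denied.
GRANTS = {
    "analyst":  {("sales_db", "read")},
    "engineer": {("sales_db", "read"), ("sales_db", "write")},
    "auditor":  {("audit_log", "read")},
}

def is_allowed(role, resource, action):
    # Default-deny: unknown roles and missing grants fail closed,
    # which is the essence of least privilege.
    return (resource, action) in GRANTS.get(role, set())

print(is_allowed("analyst", "sales_db", "read"))   # True
print(is_allowed("analyst", "sales_db", "write"))  # False
print(is_allowed("intern", "sales_db", "read"))    # False: fail closed
```

The key design choice is that the check fails closed: an identity that cannot be mapped to an explicit grant gets nothing, rather than inheriting a default.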
The top reasons for using multiple database platforms are:

1. Supporting multiple applications: 81%
2. Supporting multiple departments: 54%
3. Application vendor requirements: 42%
4. Supporting multiple workloads: 36%
5. Supporting increasing data volumes: 29%
6. Managing database licensing and support costs: 28%
7. Supporting unstructured data growth: 22%
8. Deployment in multiple or hybrid cloud: 17%
9. Avoiding vendor lock-in: 16%

The most popular digital transformation projects being undertaken by organizations right now involve cloud solutions, BI and data analytics, and cybersecurity. In addition, IoT and AI/machine learning (ML) are also important initiatives.

Top digital transformation priorities:

1. Cloud solutions
2. BI or data analytics
3. Cybersecurity/information security
4. IoT
5. AI/ML

Source: "DBTA Digital Transformation and Cloud Workloads Study," produced by Unisphere Research and sponsored by Aerospike
The past year has been a really volatile time. What are the challenges that you're helping companies navigate through?
There are several challenges that come to mind as we have worked with customers in various different verticals over the last several years and, more importantly, in the last couple of months. One is that there are more data silos being created. As you think about the speed to execute, oftentimes, that actually means going as fast as you can and doing whatever you need to do to get the results or create the outcomes. And that's creating more data silos.

What is happening?
Organizations that lack a data strategy for managing data across silos are struggling because once they create the data sprawl, they don't have the governance, they don't know how to manage data, and they don't know what data exists where. Having a single unified view of governance has become critical from our perspective and from what we're seeing from various customers and their use cases. Along with that, organizations that do not have a strong approach in terms of how they think about DataOps and automation are struggling because it just takes too long for them to get access to the data and to give data to the right people for the right business use cases. The two critical things are that, one, we see companies struggling with governance and then, two, we see organizations struggling with time-to-market or time-to-insight.

What is Zaloni's definition of DataOps?
Our view is quite simple. DataOps has emerged as a discipline taking the learnings from some of the best

Automation and governance are the two critical pillars there.
That's right. And when I say "automation," it's not just moving data from point A to point B; it's also validating it, running your test cases, and making sure that they succeed before you promote from one environment to another environment, before you actually make the data available from one zone to another zone in a trusted manner for the rest of your data consumers. All of that is front and center in terms of a DataOps approach.

Has the rise of hybrid architectures combining multi-cloud and on-prem deployments made data management more difficult for organizations?
If you think about it, every cloud provider has their own way of doing things, which fits their use cases and how they're bringing their services to the market. Now, if you're the customer, and you're trying to do these things across multiple infrastructures and multiple platforms, you don't have a common way of thinking about data. You don't have a common way of thinking about security. You have a very fragmented approach, and unless there is an abstraction layer that allows you to think about this in an organized manner, you have to build this in a very proprietary manner each time you are standing up these environments. That creates more challenges in terms of thinking about governance and thinking about compliance to various regulatory requirements that you may have, depending on your industry. All of that adds and multiplies in terms of the challenges that you have to deal with as you manage data.

What is at stake for companies that don't take a comprehensive approach?
To put it very simply, it's a question of survivability. How do companies survive, given that they have to adapt to change? In the past 12 months, we saw retailers whose approaches were no longer valid because they didn't have any traffic coming into their stores. Changing to an online model, accelerating in terms of digital transformation, and making adjustments sooner than later in a very kind of agile mode were critically important for these businesses to survive. What we see is that using data to make informed decisions so that you can retain and grow your customers is at stake. You must be able to use data effectively in a timely manner so that you can adapt to change, and so that you can reinvent some of the business models.

That brings us to the idea of data democratization, which has been a major theme for Zaloni.
At the highest level, what we mean by that is that companies need to be able to provide or enable access to data across their organization but do it in a meaningful way where you're providing the right data to the right people. As an organization, you also have a responsibility to safeguard sensitive data. If there is PII data, you need to think about complying with CCPA- and GDPR-type regulations so that you're protecting the data, have role-based access control on the data, and are making sure that you're not letting the data be available in an ungoverned manner because that, from our perspective, reduces the trust in the data. You need to have an approach where you can say that this is the original data, which may or may not be trusted, but then do something to the data to apply checks and balances and make it more trusted so that as people in the rest of the organization consume it, they can know that this data has been approved by a centralized data authority.

Does that tie into the rebranding last year of the Zaloni data platform as Arena?
Absolutely. If you think about Arena and its significance, our software platform provides a common gathering place, if you will, so that data and the data consumers, the data governance folks, and other partners are aligned with a unified view. We are allowing our customers to create these experiences where data's potential is realized through collaboration and controlled access across the organization. From our perspective, when we talk about Arena, Arena is that space where data and information are not only organized, accessed, and shared, but are also transformed into insights, into something that is meaningful for your business use cases. With Arena, we think about having a unified view of data across all your different environments and we call it the three C's. It's where we catalog the data, then allow you to control the data, no matter where it exists, so that you have the right governance model for the data.

And the third C?
And then the third piece which is critically important is: How do we allow you to consume that data so that your data consumers can come in and have easy access to it?

Looking ahead, is there a direction that you're going in that is possibly different than other companies?
There are two key things that we're focused on. One is, now that we have a base foundation where we have all these different capabilities along the data supply chain for enabling a DataOps approach in these organizations, we're adding more and more machine learning capabilities in our platform so that we can make data management and data governance much more intelligent. The idea is that as you bring in the data, our platform can automatically detect that data and not take bad records forward in the process so that you can automatically enable trust in that data. Our platform also automatically detects sensitive data so that, as you need to comply with various regulations, we can flag datasets that have PII so that you're not making them generally available or you are automatically applying our masking and tokenization functions to anonymize that data. Things like that—that are more about augmenting that data management approach with system-generated intelligence—are what we broadly call "data intelligence." Enabling data intelligence from a DataOps perspective—and from a data supply chain perspective—is one of the key things we are focused on.

And the second?
The second thing that we are all focused on is becoming that single pane of glass—that single cockpit—for customers across all of the different cloud providers. We are not just providing a shim layer on top of these cloud service providers. We're actually doing deep integration with these cloud service providers: talking to their APIs, leveraging the innovation and the new services that they're bringing to the market, but at the same time, providing that layer of abstraction so that our customers do not have to deal with the internal details and they have much more portability in terms of moving the data from one environment to another environment. Those are the two things that are front and center in our focus as we go forward.

Interview conducted, edited, and condensed by Joyce Wells.
A Game Changer
cyBERT is built to be general enough that an organization can take it and train it for its custom network behavior. Instead of using the default corpus of English-language words in BERT, cyBERT is developed using a custom tokenizer and representation trained from scratch on a large corpus of diverse cyber logs. Providing a toolset powered by NLP to perform log parsing is a game changer in the critical and time-sensitive area of cybersecurity.

Jim Scott is head of developer relations, Data Science, at NVIDIA (www.nvidia.com). Over his career, he has held positions running operations, engineering, architecture, and QA teams in the big data, regulatory, digital advertising, retail analytics, IoT, financial services, manufacturing, healthcare, chemicals, and geographical information systems industries.
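The point about the custom tokenizer can be made concrete with a toy comparison: a subword vocabulary built from English text shatters log fields into fragments, while a vocabulary built from the logs themselves keeps them whole. Below is a highly simplified sketch of generic greedy longest-match subword tokenization with hand-picked vocabularies (not cyBERT's actual tokenizer or training procedure):

```python
def greedy_tokenize(text, vocab):
    """Greedy longest-match subword tokenization against a vocabulary;
    characters not covered by any vocab entry become single tokens."""
    tokens, i = [], 0
    while i < len(text):
        # Try the longest vocab entry starting at position i.
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            tokens.append(text[i])  # out-of-vocabulary character
            i += 1
    return tokens

# Hypothetical vocabularies: one learned from English text,
# one learned from a corpus of cyber logs.
english_vocab = {"status", "error", "net", "stat", "us"}
log_vocab = {"netstat", "404", "GET", "/api/", "v1"}

line = "netstat"
print(greedy_tokenize(line, english_vocab))  # ['net', 'stat']  -- shattered
print(greedy_tokenize(line, log_vocab))      # ['netstat']      -- kept whole
```

Keeping log fields intact as single tokens is what lets a downstream model treat them as meaningful units rather than accidental collisions of English subwords.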
FALL 2021: MODERNIZING DATA MANAGEMENT FOR THE HYBRID, MULTI-CLOUD WORLD
For sponsorship details, contact Stephen Faig, [email protected], or 908-795-3702.