What Cloud Marketplaces Do and Don’t Do

The elephant in the cloud marketplace

Not long ago, we observed here in our blog that the critical insights that drive business value come from data that is both (1) fast and (2) reliable.

For many organizations, it can be hard to achieve the speed and reliability necessary to realize these critical insights without the cloud. Cloud marketplaces like Databricks, Google Cloud, Snowflake, AWS, Microsoft Azure, and more may therefore play a role in an organization’s external-data stack. When an organization passes its third-party data through a cloud marketplace, it can enjoy the benefits of data centralization on its own terms, without the capital expenditure of building the data-centralization process in-house from scratch.

As an added benefit, cloud marketplaces typically provide a modest degree of data transformation on third-party datasets, which helps prepare those datasets for downstream data users.

But cloud marketplaces are not a panacea. They represent an important tool for managing external data—but external data is a special kind of hard, with or without cloud marketplaces.

A good but incomplete tool

Cloud marketplaces aren’t particularly new. Yet coming into 2023, 79% of data leaders surveyed by Forrester Research said they needed a faster and more efficient way to onboard external data—with 59% experiencing slow time-to-value when onboarding new external datasets.

The problem is that the way external data gets managed—cloud marketplaces or no—has long been broken. One of Forrester’s key findings sheds light on the problem: data teams spend no more than 30% of their time on actual data analytics. The other 70% of their time is spent on the onboarding and maintenance processes—tedious keep-the-lights-on work that must get done but adds no extra business value.

That means that for every hour data teams spend on data analysis, they spend at least another 2 hours and 20 minutes on preparation and upkeep.
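For readers who want to check that ratio, here is a minimal sketch of the arithmetic in Python. The function name is ours, chosen for illustration; the only input taken from the Forrester finding is the 30% share of time spent on analytics.

```python
from fractions import Fraction


def overhead_minutes_per_analysis_hour(analysis_share: Fraction) -> Fraction:
    """Minutes of non-analysis work done for every 60 minutes of analysis.

    analysis_share is the fraction of total time spent on analytics
    (3/10 for the 30% figure quoted above).
    """
    if not 0 < analysis_share <= 1:
        raise ValueError("analysis_share must be in the interval (0, 1]")
    overhead_share = 1 - analysis_share
    # Ratio of overhead time to analysis time, scaled to one hour of analysis.
    return 60 * overhead_share / analysis_share


minutes = overhead_minutes_per_analysis_hour(Fraction(30, 100))
hours, remainder = divmod(minutes, 60)
print(f"{hours} h {remainder} min of upkeep per hour of analysis")
```

Because the 30% figure is an upper bound on analytics time, the result is a lower bound on upkeep: a team that spends less than 30% of its time on analysis spends even more than this per analysis hour.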

Cloud marketplaces don’t do much to solve this problem (even with the light transformative features they may offer). They are a platform—not a product. Even when getting data from a cloud marketplace, organizations still need their own extensive data-engineering resources to onboard and maintain the data themselves, including (but not limited to):

  • Wrangling and transforming data for custom downstream schema
  • Conducting QA checks on data beyond the supplier’s own QA process
  • Monitoring data products for data-health indicators
  • Creating and sending alerts for pipeline failures, delays, schema changes, and data validation issues
  • Mitigating and repairing pipeline breaks
  • Requesting and prioritizing data backfill
  • Scheduling delivery according to the consumer organization’s own needs
  • Ensuring 24/7 DataOps support for data products
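To make one item on this list concrete, here is a minimal sketch of the kind of schema-change check a data team might have to write and maintain for itself. It is an illustration only, not a description of any marketplace’s or vendor’s tooling: the names `SchemaDiff`, `diff_schema`, and `alert_lines` are hypothetical, and a schema is assumed to be a plain mapping of column name to type name.

```python
from dataclasses import dataclass


@dataclass
class SchemaDiff:
    """Differences between the schema a pipeline expects and what arrived."""
    missing: set[str]                    # expected columns that did not arrive
    unexpected: set[str]                 # columns that arrived but were not expected
    retyped: dict[str, tuple[str, str]]  # column -> (expected type, observed type)

    @property
    def changed(self) -> bool:
        return bool(self.missing or self.unexpected or self.retyped)


def diff_schema(expected: dict[str, str], observed: dict[str, str]) -> SchemaDiff:
    """Compare two {column: type} mappings (a hypothetical representation)."""
    shared = expected.keys() & observed.keys()
    return SchemaDiff(
        missing=set(expected.keys() - observed.keys()),
        unexpected=set(observed.keys() - expected.keys()),
        retyped={
            col: (expected[col], observed[col])
            for col in shared
            if expected[col] != observed[col]
        },
    )


def alert_lines(dataset: str, diff: SchemaDiff) -> list[str]:
    """Turn a diff into human-readable alert messages, in a stable order."""
    lines = [f"{dataset}: missing column '{c}'" for c in sorted(diff.missing)]
    lines += [f"{dataset}: unexpected column '{c}'" for c in sorted(diff.unexpected)]
    lines += [
        f"{dataset}: column '{c}' changed type {old} -> {new}"
        for c, (old, new) in sorted(diff.retyped.items())
    ]
    return lines


# Example with made-up column names and types.
expected = {"id": "int", "price": "float", "ts": "timestamp"}
observed = {"id": "int", "price": "str", "venue": "str"}
for line in alert_lines("vendor_prices", diff_schema(expected, observed)):
    print(line)
```

Even this small check covers only one of the failure modes listed above. Delays, validation issues, and backfill each need their own logic, and each has to be kept up to date as suppliers change their feeds.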

Additionally, cloud-marketplace data represents a copy; it is not lossless, and it is not true raw data—thereby presenting data-quality issues for data teams.

And from the perspective of a data consumer, it is incomplete. Cloud marketplaces’ data catalogs are fairly limited. To wit: Most of the data that data users want to consume doesn’t exist in a cloud marketplace.

So while cloud marketplaces have their place, they do not represent a complete solution.

The complete solution

It can be too easy to shrug off these gaps insofar as they are common to all external data—including data offered via cloud-marketplace solutions. But it doesn’t have to be that way.

As we said before, Crux prides itself on its numerous and extensive partnerships with cloud marketplaces to ensure that customers can enjoy all the perks of cloud marketplaces while still relieving their data teams of the typical external-data burdens.

Crux provides:

  • Pre-built and custom-built data transformations to match any schema and data type you prefer
  • Robust QA checks for every pipeline
  • Data-product monitoring, alerts, and proactive 24/7 mitigation and remedy for delays, schema breaks, data-validation issues, and pipeline failures
  • Proactive data backfill for licensed data products
  • Custom, transparent delivery scheduling
  • Proactive issue resolution across our extensive supplier network, backed by 24/7 DataOps support
  • More than 60,000 pre-built pipelines from across 300 data suppliers

Crux maintains more pipelines and works with more data suppliers than all of the cloud marketplaces combined. And if there’s a dataset you want to work with that isn’t in our catalog, our data engineers will work with you to quickly onboard that too—allowing you to hit the ground running in a matter of several weeks instead of several months to years.

How Crux and cloud marketplaces make external data easy

Crux is dedicated to the idea of making external data seamlessly easy. And many organizations need a cloud-marketplace solution to help them achieve data centralization. That’s why Crux has so many dedicated, valued partnerships with cloud-marketplace providers—and why Crux and cloud marketplaces succeed together in these partnerships.

That said, making external data easy for our customers is our core business. Moreover, Crux is unique in its offerings; in addition to helping some of our customers piecemeal with difficult one-off external-data challenges, Crux is the only vendor on the market fully capable of managing your external-data processes holistically from end to end. Nobody else does what we do in the way that we do it.

Cloud marketplaces can be tactically helpful external-data facilitators. But only Crux champions last-mile external-data delivery on the way to analytics-readiness.

To learn more about how to best manage and onboard your own organization’s external data, start your external-data assessment.

