In this episode of PodMagic, host Bruce Kornfeld sits down with Molly Presley, Head of Global Marketing at Hammerspace and host of the Data Unchained podcast, for a deep dive into one of today’s biggest IT challenges: making massive, distributed data actually usable. From petabyte-scale unstructured data to real-world AI deployments, Molly breaks down how Hammerspace is reshaping data orchestration — enabling enterprises to access, analyze, and move data across clouds, edge sites, and global teams without endless manual effort or data silos.

Transcript

Bruce Kornfeld

All right, welcome to PodMagic, real conversations about solving real IT challenges. I’m your host, Bruce Kornfeld, Chief Product Officer at StorMagic. And we are always exploring how simple, reliable technology can benefit you and the people you serve, whether that’s retail stores, branch offices, manufacturing sites, hospitals, anything you might be doing out there in the on-prem world. My goal is to bring interesting guests, deliver some value, and have fun along the way.

So let’s dig in. I’m very excited today. I haven’t talked to Molly in a little while, but Molly Presley is here. I’ll do a quick intro and then we’ll dive in. Molly is the head of global marketing at Hammerspace and the host of the Data Unchained podcast. She’s very famous if you don’t know that already. She’s a veteran product and marketing leader in the data storage and infrastructure space. And Molly and I have known each other for too many years. I’m not going to say the number, because I don’t want to reveal our ages, but Molly and I go way back. So welcome, Molly. Thanks for joining.

Molly Presley

It’s great to be here and so good to see you Bruce.

Bruce Kornfeld

We have a lot we could talk about. I don’t actually know where the conversation will meander to, but let’s just start off with why don’t you pretend that I don’t know what Hammerspace does, even though I know a little bit. But just explain to me and anyone else out there listening what Hammerspace is, what you guys do, and what problems you solve.

Molly Presley

So Hammerspace, if you kind of take a step back and think about why on earth even that name, it’s kind of an unusual name. If you’re into anime, comics, anything like that, that’s actually where the name originated. Whatever you think of, whether it’s Mary Poppins’ purse and she’s pulling all these things out of this very small purse, or the recent Spidey-verse movie, where they literally say he pulled his big hammer out of his hammerspace. It’s the magical place where very large, even infinitely large, things can come out of a very small area.

And that was the founding term around our company name, which makes it easy to remember: essentially, imagine you have all this unstructured data, petabytes, exabytes, whatever you have as an organization stored in a whole bunch of locations, but you really just want to use it as a single entity. That’s Hammerspace. Interact with all those sites and all that data through one single entity or data platform, that’s what we do.

Bruce Kornfeld

The interesting overlap here is that we’re mostly different, but somehow the same: we focus on small amounts of data at lots of locations, so the “lots of locations” part is where we meet. We’ll get into that as we go through here for sure. So let’s just start off with a question around unstructured data. What do you guys see as broken out there? What are you doing to help solve some of these unstructured data problems?

Molly Presley

I’d say it’s less about broken and more about how infrastructure has evolved. Most infrastructure was designed with a one-to-one ratio of use case to data. You bought an Isilon system for your genomics processing. You bought a NetApp for your home directories. You bought Lustre for your supercomputing workload. And those data sets were purpose-built for their specific infrastructure for performance, capacity, and security, but they’re also locked into that infrastructure.

Now that we’ve evolved to a world where maybe you have research scientists or agentic AI projects that want access to data that wasn’t really designed to be shared with other applications, it becomes very difficult to get to that data. So what we’re really seeing is people in architectural roles, CTOs, data architects, AI architects, trying to figure out how to create a many-to-one relationship between applications, users, and locations on one side and their organization’s data on the other.

And then there’s the issue that a lot of that data was not well described. I knew what I created, and you knew what you created, but no one else really knew what those data sets were for. The idea of universal metadata, or tagging of data so you can find it even if you weren’t the original data owner, is fairly broken and something we need to solve as an industry.

Bruce Kornfeld

It seems to me that there’s a whole industry, a whole career path, of data scientists that has grown up over the last decade or two, whose job is to help companies figure all this out. Are you putting data scientists out of work here? Are they doing different things now? Because it sounds like you simplify this whole thing.

Molly Presley

We do, but there are two ways you can think about it. It’s kind of like with AI. Is AI going to replace you, or is it going to make you better? Is it going to make you more productive as a human? And I suppose that same question could come for data scientists and Hammerspace.

If you think about ETL and that process of “I’m going to go identify a dataset, extract it to my machine, do some work with it, and then load those results back in,” that’s very serial. Data scientist one isn’t interacting with data scientist two. They’ve all pulled their own datasets out and are doing their own work asynchronously.

What Hammerspace enables is: you can all share the same dataset. You don’t actually have to extract it to do that. You can work within your Hammerspace environment with the same data that somebody else is investigating, and have access to all the data. Instead of knowing, “I’m going to go get Bruce’s dataset on a certain genomics project,” you could search the metadata of all the data from all the people around the world that you have in Hammerspace, find what’s relevant, and load it into your Jupyter notebook.

It’s more about taking things that were historically manual and human-driven and, now that we have so much data, driving that automation through software instead of manual processes. It makes the data scientists more productive.
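A minimal sketch of the shared-dataset workflow described above, in Python: search shared metadata for relevant files, then load them straight from the common namespace into a notebook session. The mount point, catalog file, and tag scheme are illustrative assumptions for this example, not Hammerspace’s actual interface.

```python
# Illustrative sketch: search shared metadata for relevant files, then load them
# directly from the common namespace instead of extracting a private copy first.
# The mount point, catalog.csv layout, and tag names are assumptions for this example.
import pandas as pd

NAMESPACE = "/mnt/shared-namespace"                  # assumed mount of the shared data platform
catalog = pd.read_csv(f"{NAMESPACE}/catalog.csv")    # assumed metadata export: path, owner, tags

# Find every file tagged for the genomics project, no matter which site or person created it.
matches = catalog[catalog["tags"].str.contains("genomics", na=False)]

# Load the matching datasets straight from the namespace; nothing is copied out first.
frames = [pd.read_parquet(f"{NAMESPACE}/{p}") for p in matches["path"]]
dataset = pd.concat(frames, ignore_index=True)
print(f"Loaded {len(dataset):,} rows from {len(frames)} files")
```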

Bruce Kornfeld

One thing I’d like to ask about: let’s talk about the small site situation. I get the sense that maybe a lot of what you do is more big data, data center stuff. We can talk about that too, but let’s just think about the small sites.

One thing we find at StorMagic, one of our mantras, is simplicity, because these customers don’t typically have IT people at every site. Everything needs to be super simple on-prem at these smaller sites, and management needs to be centralized. So how do you think about that? How do you make what you do simple for customers who have data at lots of sites around the world?

Molly Presley

We’ve done something that seems obvious but takes a long time to do: we’ve built our storage services into Linux. We have several Linux kernel maintainers who work for Hammerspace, and the storage services you need, things like connecting the client to the storage system or connecting data to our metadata environment, are built into Linux.

When you think about that, it means any client, a sensor, a vehicle, a telescope, a small remote data center, if you’re running Linux, it can be a member of your Hammerspace system. You simply map an IP address and suddenly, as that data is being generated, we’re also creating metadata around that data.

Imagine the power of this: you don’t need a human there. You just have a Linux machine that maps to an IP address, and all of a sudden, it’s a member of this shared data platform. And even more cool, as that data is being generated, there’s metadata that’s searchable by others in the organization. Someone can say, “I’m working on this project and I’m looking for data that matches certain characteristics, oh look, some of that data was just created in this remote data center,” and then pull it into their project. No replication or manual copying required.

Because it’s built into Linux, it’s very easy. You don’t even have to load software. It’s already built into your Linux machine.
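A minimal sketch of what that looks like from the client side, assuming a standard NFS mount and an extended attribute for tagging; the server address, export path, and attribute names are hypothetical, not Hammerspace’s documented procedure.

```python
# Minimal sketch of a Linux client joining a shared namespace with no extra agent:
# mount over the NFS client already in the kernel, write newly generated data, and
# attach a searchable tag. The server address, export path, and use of an extended
# attribute for tagging are assumptions for illustration.
import os
import subprocess

EXPORT = "198.51.100.10:/shared"   # assumed IP and export of the data platform
MOUNTPOINT = "/mnt/shared"

os.makedirs(MOUNTPOINT, exist_ok=True)
subprocess.run(["mount", "-t", "nfs", "-o", "vers=4.2", EXPORT, MOUNTPOINT], check=True)

# Data generated at this site lands directly in the global namespace...
sample = os.path.join(MOUNTPOINT, "site-042", "telemetry.csv")
os.makedirs(os.path.dirname(sample), exist_ok=True)
with open(sample, "w") as f:
    f.write("timestamp,sensor,value\n")

# ...and gets a tag so someone at another site can find it by searching metadata.
os.setxattr(sample, "user.project", b"turbine-vibration-study")
```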

Bruce Kornfeld

In order for your software to work, the end user does not need to run an agent or anything locally? It just works?

Molly Presley

Exactly. As long as the machine is on a reasonably modern build of Linux, that’s true. Or you can connect if you’re a Windows box using something like SMB. But for big data-generating systems, those tend to be Linux these days.

Bruce Kornfeld

What we do is not what you do, right? But we create the platform that customers use at all of these sites. We are Linux-based, we use a good amount of open-source software, but we harden it into an appliance. At the core, it’s Linux, right?

Molly Presley

Most things are, right?

Bruce Kornfeld

And then we create a hypervisor so customers can run whatever applications they want at their small sites. It’s starting to sound like, shame on us for not having this conversation sooner, but maybe there’s some overlap here, some go-to-market opportunities for partnership. Helping each other help customers at these small sites. That sounds exciting, actually.

Molly Presley

Yeah, absolutely. When you want to connect those small sites to other business units or projects, that sometimes gets really challenging. Hammerspace helps with that.

I’ll give you an example. Yesterday I was at a conference called the Trillion Parameter Consortium, which is basically a bunch of supercomputing and AI people trying to figure out: how do we keep going bigger, faster, with more data and more compute? But what they were really trying to figure out was: how do we connect this very distributed world to build a new language model?

The language model technology exists, and the data is scattered all over the place, but the model needs access to that data. And that’s a really hard problem to solve. There are compliance issues, regulations, security, but also just: where is the data? Unified metadata is a good start.

Bruce Kornfeld

That’s a great example. We’re actually seeing something similar in our business, where edge sites used to be very isolated and autonomous. But suddenly now, every customer wants to connect them, replicate data, analyze data, train models centrally, then push models back to the edge. You said it perfectly, the architecture wasn’t built for that originally.

And now everyone’s trying to retrofit the old world into this new AI-driven world.

Molly Presley

Exactly. And what’s interesting is that people are now discovering that they don’t always need to move the data to do the work. That’s a big shift.

Historically, when you wanted to do something with data, the first step was always, “Okay, copy it or ETL it to the cluster where I’m going to analyze it.” Now if the metadata system knows where everything is, and the namespace gives you access to all of it, you can bring the compute to the data instead of the data to the compute.

That’s a really big change. And it’s required because moving petabytes around is incredibly expensive, time-consuming, and in some cases impossible.

Bruce Kornfeld

Right. The gravity of data wins every time.

Molly Presley

Exactly. Data has gravity. You either have to lighten the data, which isn’t happening, because as everyone loves to say, we’re generating more data in a day than we used to in a year, or you have to change the relationship between the data and the applications.

That’s the direction the industry is going: not “move the data,” but “access the data, wherever it lives, as if it were local.”

Bruce Kornfeld

It sounds like one of the things customers get from Hammerspace is location independence. Meaning: I don’t have to know or care where the data physically is. I can just use it.

Molly Presley

Exactly. And it works both ways. A user doesn’t need to care where it is, and the admin doesn’t have to care which storage vendor it’s on, or whether it’s in a data center or a cloud bucket. They just define policy: performance here, cost tier there, archive there, etc.

The system handles placement, movement, accessibility, and security, and users just work with the data.

Bruce Kornfeld

So how do you think about data movement? Because it sounds like there are times where you don’t need to move the data, but there must be situations where the customer does want to move it. Is that automated? Policy based? On-demand?

Molly Presley

Yes, sometimes you do want to move the data. Maybe you want a second copy for resilience. Maybe you want it closer to a GPU cluster. Maybe the user doesn’t want WAN latency. And sometimes there are compliance or sovereignty issues that require movement.

In Hammerspace, that’s all handled through policy-based orchestration. The admin defines why data should move, for example: “all data tagged as project X must exist in region Y” or “all data older than 30 days moves to an archive tier” or “all training data must live in this cluster.” And the system moves it in the background, without interrupting access.

To the user, it still looks like one namespace. The file path never changes. Even if a file physically moves to a different site, nothing breaks.
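A small illustrative sketch of policy-based orchestration as described here: rules that state why data should move, applied in the background while the logical path users see stays the same. The Rule structure, tier names, and paths are hypothetical, not Hammerspace’s actual policy language.

```python
# Illustrative sketch of policy-based orchestration: an admin declares *why* data
# should move, and a background process plans the moves. The Rule class, tier names,
# and paths are hypothetical, not Hammerspace's actual policy engine.
import os
import time
from dataclasses import dataclass
from typing import Callable

@dataclass
class Rule:
    description: str
    matches: Callable[[str], bool]   # does this file fall under the rule?
    target_tier: str                 # where matching data should physically live

rules = [
    Rule("data untouched for 30 days moves to the archive tier",
         lambda path: time.time() - os.path.getatime(path) > 30 * 86400,
         "archive"),
    Rule("all data tagged for project X must exist in region Y",
         lambda path: "project-x" in path,
         "region-y-flash"),
]

def plan_moves(namespace_root: str):
    """Walk the namespace and report which files each policy would relocate.

    The logical path users see never changes; only physical placement would."""
    for dirpath, _, filenames in os.walk(namespace_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            for rule in rules:
                if rule.matches(path):
                    yield path, rule.target_tier

for path, tier in plan_moves("/mnt/shared"):
    print(f"{path} -> {tier}")
```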

Bruce Kornfeld

The user doesn’t see something like “mount this share, mount that share.” It’s just: here is the dataset?

Molly Presley

Exactly. One global namespace. That’s the whole point.

Bruce Kornfeld

Let’s talk a bit about AI since that’s the buzzword of everything now. How does Hammerspace fit into the world of AI? What problems do you see coming from AI projects?

Molly Presley

AI projects have exposed how bad the world of unstructured data management really was. For years, people kind of got away with messy, isolated storage silos because humans worked inside them and knew where their data lived. But AI workflows need massive datasets, high-speed access, global visibility, metadata describing the data, the ability to train in one place and infer in another, and the ability to continually update datasets.

Old systems were not designed for that. Even the best legacy NAS systems were not designed for billions of small files, or exabyte scale, or 10 data centers acting like one.

AI actually forces a re-architecture of data platforms. That’s why we’re suddenly seeing all this growth: customers aren’t just saying “I want better storage,” they’re saying “my AI strategy literally won’t work unless I solve this.”

Bruce Kornfeld

And what about cost? Because I keep hearing customers say things like: “we have tons of data, but we can’t afford to keep it all on expensive storage.” How do you think about cost tiers, cloud, etc.?

Molly Presley

Cost matters, especially when you’re talking petabytes or exabytes. With Hammerspace, the customer can use any storage tier under the hood: high-performance NVMe, object storage, cloud buckets, tape, whatever.

The namespace stays the same, but the storage class can change underneath based on policy. So maybe recent data stays on fast flash, and older data auto-migrates to S3 or cheap object storage, but the user still sees it the same way.

That’s very different from the old world where archiving meant: take it offline, make it inconvenient, make it something you hope you never need again.

Bruce Kornfeld

Let’s talk about you for a minute. You and I have been in this industry a while. We’ve seen tape, we’ve seen RAID, we’ve seen cloud, we’ve seen hyperconverged, we’ve seen object storage, and now AI. What keeps you excited? Why are you still here?

Molly Presley

I ask myself that sometimes. But honestly? The data world has never been more exciting than it is right now.

We’re finally at the moment where data is not just something you store. It’s something that creates value almost immediately. Companies that figure this out become different kinds of companies. And I like being part of helping define that.

I like being around smart people who are building things people said were “impossible” 5 years ago.

Bruce Kornfeld

And you’re doing a podcast now too. Tell us about Data Unchained.

Molly Presley

Yeah, so Data Unchained is a podcast where I interview people working on the bleeding edge of unstructured data, AI pipelines, high-performance computing, large-scale research—things like that.

We talk less about “products” and more about what’s changing in the world of data and why. And what I’ve learned is: people LOVE hearing how other organizations are solving problems that they haven’t solved yet.

You’d be shocked how many listeners are taking notes during episodes.

Bruce Kornfeld

Okay, before we wrap, I want to ask you the PodMagic standard closing question. We’re all about making IT simple. In your opinion, what’s the one thing the industry most needs to simplify right now?

Molly Presley

Hands down: make data self-describing. That is the unlock.

If every dataset carried useful metadata automatically, you wouldn’t need tribal knowledge, you wouldn’t need six meetings to find where data lives, and AI models could select their own input data without human intervention.

We don’t need smarter storage. We need smarter data.

Bruce Kornfeld

I love that. And that actually ties back into what we started talking about: data discovery, metadata, and sharing between sites.

This has been awesome, Molly. I’m glad we finally caught up again. And I think we’ve uncovered some real synergy between StorMagic and Hammerspace. We should follow up for sure.

Molly Presley

Absolutely. Let’s make something happen. And thanks for having me, this was a blast.

Bruce Kornfeld

Awesome. Thanks again, Molly. And to our listeners, thanks for tuning into PodMagic. If you enjoyed today’s episode, please like, subscribe, share it with a colleague, and we’ll see you again on the next one.