ipfs in web apps guide #1970

Merged
merged 25 commits into from
Feb 11, 2025
Changes from 10 commits
3 changes: 3 additions & 0 deletions .github/styles/pln-ignore.txt
Original file line number Diff line number Diff line change
@@ -128,6 +128,7 @@ mainnet
markdown(lint)
markdownlint
merkle
merklizing
metadata('s)
metamask
minimalistic
@@ -140,6 +141,7 @@ multiaddrs
multibase
multicast
multicodec
multicodec(s)
multiformats
multihash
multihashes
@@ -217,6 +219,7 @@ trustlessly
uncensorable
undialable
uniswap
unixfs
unreachability
untrusted
upgradeability
6 changes: 3 additions & 3 deletions docs/.vuepress/config.js
@@ -255,12 +255,12 @@ module.exports = {
]
},
{
title: 'IPFS in the browser',
title: 'IPFS on the web',
sidebarDepth: 1,
collapsable: true,
children: [
'/how-to/ipfs-in-web-apps',
'/how-to/address-ipfs-on-web',
'/how-to/browser-tools-frameworks'
]
},
{
@@ -277,7 +277,7 @@ module.exports = {
collapsable: true,
children: [
'/how-to/gateway-best-practices',
'/how-to/gateway-troubleshooting'
'/how-to/gateway-troubleshooting',
]
},
{
1 change: 1 addition & 0 deletions docs/.vuepress/redirects
@@ -40,6 +40,7 @@
/guides/guides/addressing/ /how-to/address-ipfs-on-web
/guides/guides/install/ /install
/how-to/host-single-page-site /how-to/websites-on-ipfs/single-page-website
/how-to/browser-tools-frameworks /how-to/ipfs-on-the-web/
/how-to/troubleshoot-file-transfers /how-to/troubleshooting
/how-to/run-ipfs-inside-docker /install/run-ipfs-inside-docker
/install/command-line-quick-start/ /how-to/command-line-quick-start
21 changes: 11 additions & 10 deletions docs/concepts/lifecycle.md
@@ -5,25 +5,26 @@

# The lifecycle of data in IPFS

- [1. Content-addressable representation](#1-content-addressable-representation)
- [2. Pinning](#2-pinning)
- [3. Retrieval](#3-retrieval)
- [4. Deleting](#4-deleting)
- [1. Content-addressing](#1-content-addressing)
- [2. Providing](#2-providing)
- [3. Retrieving](#3-retrieving)
- [Learn more](#learn-more)

## 1. Content-addressable representation
## 1. Content-addressing / Merkleizing


The file is transformed into a content-addressable representation using a CID. The basic idea is that this representation makes files and directories **content-addressable** via CIDs by chunking files into smaller blocks, calculating their hashes, and constructing a [Merkle DAG](./merkle-dag.md).
The first stage in the lifecycle of data in IPFS is to address it by CID. This is a local operation that takes arbitrary data and encodes it so it can be addressed by a CID.

## 2. Pinning
The exact process depends on the type of data. For files and directories, this is done by constructing a [UnixFS](./file-systems.md#unix-file-system-unixfs) [Merkle DAG](./merkle-dag.md). For other data, such as structured objects, this is done by encoding the data with a codec like [dag-cbor](https://ipld.io/docs/codecs/known/dag-cbor/); the encoded bytes are then hashed to produce a CID.

## 2. Providing

In this stage, the blocks of the CID are saved on an IPFS node (or pinning service) and made retrievable to the network. Simply saving the CID on the node does not mean the CID is retrievable, so pinning must be used. Pinning allows the node to advertise that it has the CID, and provide it to the network.

- **Advertising:** In this step, a CID is made discoverable to the IPFS network by advertising a record linking the CID and the server's IP address to the [DHT](./dht.md). Advertising is a continuous process that repeats typically every 12 hours. The term **publishing** is also commonly used to refer to this step.

- **Providing:** The content-addressable representation of the CID is persisted on one of web3.storage's IPFS nodes (servers running an IPFS node) and made publicly available to the IPFS network.

## 3. Retrieval
## 3. Retrieving

In this stage, an IPFS node fetches the blocks of the CID and constructs the Merkle DAG. This usually involves several steps:

@@ -35,13 +36,13 @@

- **Local access:** Once all blocks are present, the Merkle DAG can be constructed, making the file or directory underlying the CID successfully replicated and accessible.

## 4. Deleting
<!-- ## 4. Deleting

At this point, the blocks associated with a CID are deleted from a node. **Deletion is always a local operation**. If a CID has been replicated to other nodes, it will continue to be available on the IPFS network.

:::callout
Once the CID is replicated by another node, it is typically advertised to DHT by default, even if it isn't explicitly pinned.
:::
::: -->

## Learn more

32 changes: 0 additions & 32 deletions docs/how-to/browser-tools-frameworks.md

This file was deleted.

107 changes: 107 additions & 0 deletions docs/how-to/ipfs-in-web-apps.md
@@ -0,0 +1,107 @@
---
title: IPFS in web applications
description: How to develop applications that use IPFS in web browsers, including IPFS retrieval and pinning in browsers using implementations such as Helia.
---

# IPFS in web applications and resource-constrained environments

In this guide you will learn how to use IPFS with JavaScript/TypeScript in web applications, including addressing data with CIDs, retrieval by CID, working with CAR files, and the nuances of providing.

For this, you will use [Helia](https://github.com/ipfs/helia), the most actively maintained implementation of IPFS in TypeScript for use on the web.

> **Note:** this guide is focused solely on using IPFS for data within a web application. It does _not_ cover using IPFS for static website distribution with IPFS Gateways.

## Challenges with IPFS on the web

IPFS allows you to fetch data by CID from multiple providers without being reliant on a single authoritative server.

However, making all of this work on the web is tricky due to networking constraints. Browsers impose many restrictions on web apps; for example, they cannot open raw TCP or UDP connections. Instead, web apps are constrained to HTTP, WebSockets, WebRTC, and most recently WebTransport.

There are good reasons for this, such as security and resource management, but ultimately it means that using IPFS on the web differs from using it in native binaries.
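
As a rough illustration of that constraint, the sketch below filters a list of peer addresses down to those a browser could plausibly dial. The helper `isBrowserDialable` and the example addresses are hypothetical and not part of Helia or libp2p. The check is deliberately naive: it only looks for protocol names in the address string, and it ignores details such as secure-context rules that block plain `ws` and `http` on HTTPS pages.

```typescript
// Multiaddr protocol names that map to the browser-friendly transports named above.
// Assumption: a simple name match is good enough for illustration purposes.
const BROWSER_TRANSPORTS = ["http", "https", "ws", "wss", "webrtc", "webrtc-direct", "webtransport"];

// Hypothetical helper: true if any component of the multiaddr names a browser transport.
function isBrowserDialable(multiaddr: string): boolean {
  return multiaddr.split("/").some((part) => BROWSER_TRANSPORTS.indexOf(part) !== -1);
}

// Example addresses (documentation IP range and a made-up hostname).
const addrs = [
  "/ip4/203.0.113.7/tcp/4001", // raw TCP: not dialable from a browser
  "/ip4/203.0.113.7/udp/4001/quic-v1", // raw QUIC: not dialable from a browser
  "/dns4/node.example.com/tcp/443/wss", // Secure WebSockets
  "/ip4/203.0.113.7/udp/4001/quic-v1/webtransport", // WebTransport
];

const dialable = addrs.filter(isBrowserDialable);
console.log(dialable.length); // 2
```

In practice the transport libraries make this decision for you; the point is only that a large share of the addresses advertised by native nodes are unusable from a web page.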

## Key IPFS operations: Addressing, Retrieving, and Providing

IPFS exposes three main operations that developers use to interact with the network:

- **Addressing data with CIDs** (also known as merkleizing): taking arbitrary data and encoding it so it's addressable by CID. For example, taking a file and encoding it so it can be addressed by a CID.

- **Retrieving data by CID**: given a CID, IPFS finds providers (peers who share the block), connects to them, fetches the blocks, and verifies that the retrieved data is what the CID represents.
- **Providing data by CID**: making data addressed by a CID retrievable by other peers, either by running a node or with a pinning service.

## Addressing data by CID

As mentioned above, the first step in the [lifecycle of data in IPFS](../concepts/lifecycle.md) is to address it by CID.

When addressing data by [CIDs](https://proto.school/anatomy-of-a-cid/03) you will need to choose:

- A [hash function](../concepts/glossary.md#hash-function). For use in browsers, the default and recommended hash function is `sha2-256`, which is also the default in [Helia](https://github.com/ipfs/helia).
- A [multicodec](../concepts/glossary.md#multicodec), which identifies the format of the data you are addressing and is used to help decode it. CIDs support a wide range of multicodecs, but for most purposes you will likely want to use either:
  - [UnixFS](../concepts/file-systems.md#unix-file-system-unixfs) for files and directories.
  - [dag-cbor](../concepts/glossary.md#dag-cbor) for JSON-like structured data with binary encoding. DAG-CBOR is an extension of CBOR that adds a "link" type for CIDs, allowing for the creation of interlinked CBOR objects (which can be used to form larger linked data structures).
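
To make the role of these two choices concrete, here is a minimal sketch that assembles a CIDv1 by hand for a single block of bytes. It is not how Helia is used in practice (the examples further down use Helia's API); it uses Node's built-in `node:crypto` module for hashing, where a browser would use `crypto.subtle.digest`. The helper names `base32` and `cidV1` are made up for this illustration, and the sketch only handles codec values below 128.

```typescript
import { createHash } from "node:crypto";

// Codes from the multiformats table.
const RAW_CODEC = 0x55; // multicodec: raw bytes
const SHA2_256 = 0x12; // multihash: sha2-256

// RFC 4648 base32, lowercase and unpadded (multibase prefix "b").
function base32(bytes: Uint8Array): string {
  const alphabet = "abcdefghijklmnopqrstuvwxyz234567";
  let out = "";
  let bits = 0;
  let value = 0;
  for (let i = 0; i < bytes.length; i++) {
    value = (value << 8) | bytes[i];
    bits += 8;
    while (bits >= 5) {
      out += alphabet[(value >>> (bits - 5)) & 31];
      bits -= 5;
    }
    value &= (1 << bits) - 1; // keep only the bits not yet emitted
  }
  if (bits > 0) out += alphabet[(value << (5 - bits)) & 31];
  return out;
}

// CIDv1 = <version><multicodec><multihash code><digest length><digest>.
// Each prefix value here is below 128, so it fits in a single varint byte.
function cidV1(codec: number, data: Uint8Array): string {
  const digest = createHash("sha256").update(data).digest();
  const bytes = new Uint8Array([0x01, codec, SHA2_256, digest.length, ...Array.from(digest)]);
  return "b" + base32(bytes);
}

const cid = cidV1(RAW_CODEC, new TextEncoder().encode("hello world"));
console.log(cid); // bafkreifzjut3te2nhyekklss27nh3k72ysco7y32koao5eei66wof36n5e
```

Passing a different codec value (for example `0x71` for dag-cbor) changes the resulting CID even though the bytes and the hash function stay the same.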

### CID Determinism

One important thing to note is that **the same data can result in different CIDs** depending on a number of factors, including the hash function and the multicodec you use. **This is especially true for files**, where the same file, hash function and multicodec can still result in different CIDs depending on the different options that UnixFS supports.

See the [forum discussion on CID profiles](https://discuss.ipfs.tech/t/should-we-profile-cids/18507) and the [DASL](https://dasl.ing/) initiative for more information on the nature of this problem and how the community is addressing it.

For a visual demonstration of this, try the [DAG Builder](https://dag.ipfs.tech/), which visualises how files are addressed by CID with UnixFS and demonstrates how the same file can result in different CIDs, depending on the different options that UnixFS supports.
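
The effect of chunking can also be sketched in a few lines. The code below is a toy model, not the UnixFS algorithm: the hypothetical `toyRoot` function hashes each fixed-size chunk and then hashes the concatenated chunk digests, which is enough to show that the chunk size alone changes the root hash for identical input bytes.

```typescript
import { createHash } from "node:crypto";

// Toy Merkle root: hash each fixed-size chunk, then hash the concatenated chunk digests.
// This is NOT the UnixFS algorithm; it only mimics its "chunk, hash, link" shape.
function toyRoot(data: Uint8Array, chunkSize: number): string {
  const root = createHash("sha256");
  for (let offset = 0; offset < data.length; offset += chunkSize) {
    const chunk = data.subarray(offset, offset + chunkSize);
    root.update(createHash("sha256").update(chunk).digest());
  }
  return root.digest("hex");
}

const file = new Uint8Array(1024).fill(7); // stand-in for a file's bytes
const rootA = toyRoot(file, 256);
const rootB = toyRoot(file, 512);
console.log(rootA === rootB); // false: same bytes, different chunk size, different root
```

UnixFS has more options than chunk size (for example DAG layout and leaf encoding), and each of them can change the final CID in a similar way.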

### Example: Addressing an object by CID with dag-cbor

To address an object by CID with the `dag-cbor` multicodec and the `sha2-256` hash function, you can use the following [Helia](https://github.com/ipfs/helia) code:

<iframe height="300" style="width: 100%;" scrolling="no" title="Addressing an object by CID with Helia and dag-cbor" src="https://codepen.io/2color/embed/xbKpJKx?default-tab=js%2Cresult" frameborder="no" loading="lazy" allowtransparency="true" allowfullscreen="true">
See the Pen <a href="https://codepen.io/2color/pen/xbKpJKx">
Addressing an object by CID with Helia and dag-cbor</a> by Daniel Norman (<a href="https://codepen.io/2color">@2color</a>)
</iframe>

### Example: Addressing a file by CID with UnixFS

<iframe height="500" style="width: 100%;" scrolling="no" title="Addressing an image by CID with Helia and UnixFS" src="https://codepen.io/2color/embed/zxONqPj?default-tab=js%2Cresult" frameborder="no" loading="lazy" allowtransparency="true" allowfullscreen="true">
See the Pen <a href="https://codepen.io/2color/pen/zxONqPj">
Addressing an image by CID with Helia and UnixFS</a> by Daniel Norman (<a href="https://codepen.io/2color">@2color</a>)
</iframe>

## Retrieval

From a high level, there are several ways to retrieve data with IPFS in web applications:

- Using the [`Verified Fetch`](https://www.npmjs.com/package/@helia/verified-fetch) library, which was modelled after the `fetch` API and returns `Response` objects, with the main difference being that it allows you to fetch data by CID, abstracting away the details of content routing, transports and retrieval. For more examples and background see the [release blog post](https://blog.ipfs.tech/verified-fetch/).
- Using the [`helia`](https://github.com/ipfs/helia/) library, which is the foundation for the `verified-fetch` library, and provides a more comprehensive and modular API for interacting with the IPFS network, beyond just retrieval.
- Using public recursive gateways, e.g. `ipfs.io` with HTTP. This is not recommended for most use cases, because it forgoes the verifiability and trustlessness enabled by content addressing. Granted, it might be the easiest way to retrieve data in a web application, but it is also the most fraught with security and centralization concerns.

Member:

Do we want to mention that our goal is to push away from public backend infrastructure like this and that we aim more toward trustless gateways?

Member Author:

We should link to the gateways page which is pretty up to date with these nuances. Also are you referring to trustless gateway providers or just recursive trustless gateways?

Member (@SgtPooki, Feb 4, 2025):

Our team goals include moving away from recursive trustless gateways AFAIK, but we will probably always run a trustless gateway provider. I'd have to defer to Lidel/Adin for a 5-10-year plan. We don't need to mention it if it's that far ahead.

My main point was that, besides informing the readers that it's not recommended, we should also indicate that we are "tightening our purse strings" regarding hosting those services.

Member Author:

I agree completely. Just not sure if this belongs here. This information about the general shift applies to a lot of content in the docs, so maybe worthy of its own section in the gateways page.

Member (@lidel, Feb 11, 2025):

Agree there is a need to communicate "listen, gateways may disappear at some point, look at Cloudflare one not being a thing anymore, don't build things on sand". But it is out of scope here, work for future PR.

Virtually all pages that touch gateways need to be redone, we have too many pages with partial and often outdated info (created before https://specs.ipfs.tech/http-gateways/ existed).


### Example: Image retrieval with Verified Fetch

<iframe height="500" style="width: 100%;" scrolling="no" title="Fetch an image on IPFS Mainnet @helia/verified-fetch" src="https://codepen.io/2color/embed/QWXKZGx?default-tab=js%2Cresult" frameborder="no" loading="lazy" allowtransparency="true" allowfullscreen="true">
See the Pen <a href="https://codepen.io/2color/pen/QWXKZGx">
Fetch an image on IPFS Mainnet @helia/verified-fetch</a> by Daniel Norman
</iframe>
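
Whichever retrieval method you choose, the property worth preserving is verification: the bytes you receive should hash to the digest embedded in the CID. The sketch below shows that check in isolation for the simplest case, a base32 CIDv1 that uses `sha2-256`. It is an illustration only, not the code Verified Fetch runs; `decodeBase32` and `matchesCid` are hypothetical helper names, and `node:crypto` stands in for the browser's `crypto.subtle`.

```typescript
import { createHash } from "node:crypto";

// Decode RFC 4648 base32 (lowercase, unpadded), the encoding behind CIDs that start with "b".
function decodeBase32(text: string): Uint8Array {
  const alphabet = "abcdefghijklmnopqrstuvwxyz234567";
  const out: number[] = [];
  let bits = 0;
  let value = 0;
  for (let i = 0; i < text.length; i++) {
    const index = alphabet.indexOf(text[i]);
    if (index < 0) throw new Error("invalid base32 character: " + text[i]);
    value = (value << 5) | index;
    bits += 5;
    if (bits >= 8) {
      out.push((value >>> (bits - 8)) & 0xff);
      bits -= 8;
    }
    value &= (1 << bits) - 1; // keep only the bits not yet emitted
  }
  return new Uint8Array(out);
}

// Returns true if `block` hashes to the digest embedded in `cid`.
// Assumption: the CID is a base32 CIDv1 with a one-byte codec and a sha2-256 multihash.
function matchesCid(cid: string, block: Uint8Array): boolean {
  if (cid[0] !== "b") throw new Error("only base32 CIDv1 strings are handled in this sketch");
  const bytes = decodeBase32(cid.slice(1));
  // Layout: <version><codec><hash code><digest length><digest>
  if (bytes[0] !== 0x01 || bytes[2] !== 0x12 || bytes[3] !== 32) {
    throw new Error("only CIDv1 with sha2-256 is handled in this sketch");
  }
  const expected = bytes.subarray(4, 36);
  const actual = createHash("sha256").update(block).digest();
  return expected.length === actual.length && expected.every((byte, i) => byte === actual[i]);
}

const helloCid = "bafkreifzjut3te2nhyekklss27nh3k72ysco7y32koao5eei66wof36n5e";
const encoder = new TextEncoder();
console.log(matchesCid(helloCid, encoder.encode("hello world"))); // true
console.log(matchesCid(helloCid, encoder.encode("hello w0rld"))); // false
```

A response from a recursive gateway that is used without a check like this has to be trusted as-is, which is the verifiability that plain HTTP retrieval gives up.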

## Providing data

For data to be retrievable by other peers on [IPFS Mainnet](../concepts/glossary.md#ipfs-mainnet) it will need to be uploaded to a pinning service or an IPFS node.

When possible, it's best to rely on client-side merklization to address data by CID and then upload it to a pinning service or a node. [CAR files](#car-files) are a great way to do this, though they are not supported by all pinning services.


### You probably don't want to provide data from a browser

Browsers make for lousy servers. It's difficult to make a web page act as a server, i.e. accept incoming network connections from other computers. WebRTC is the only exception; however, it has many caveats and doesn't work in all networks.

For this reason, you should never count on providing data from a browser to work.

Instead, you should provide data from a long-running server that runs reliably and has a public IP. That can be a Kubo node that you run, or a [pinning service](../concepts/persistence.md#pinning-services).

### CAR files

The CAR (Content Addressable aRchive) format is a way of packaging content-addressed data into archive files that can be easily stored and transferred over the network. You can think of CAR files as TAR files designed for storing collections of content-addressed data.

So why would you want to use CAR files?

One of the main reasons is related to [CID determinism](#cid-determinism). As mentioned above, the same data can result in different CIDs, which can make it difficult to verify data without its content-addressed representation. By packaging up the data into a CAR file, you can upload the CAR to multiple pinning services and nodes knowing they are providing the same CIDs.

CAR files make content-addressed data easy to store and transport, and Helia (like other implementations) allows you to both export any data you've addressed by CID into a CAR file and import it again.

<iframe height="300" style="width: 100%;" scrolling="no" title="CAR export with Helia and dag-cbor" src="https://codepen.io/2color/embed/EaYoegX?default-tab=js%2Cresult" frameborder="no" loading="lazy" allowtransparency="true" allowfullscreen="true">
See the Pen <a href="https://codepen.io/2color/pen/EaYoegX">
CAR export with Helia and dag-cbor</a> by Daniel Norman (<a href="https://codepen.io/2color">@2color</a>)
</iframe>
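
To show what ends up inside such a file, the following sketch writes a minimal CARv1 by hand: a DAG-CBOR header listing the root CID, followed by one length-prefixed section containing the CID and the block bytes. It is meant only to illustrate the layout for a single raw block hashed with `sha2-256`; the function names are hypothetical, and in an application you would use a library such as `@helia/car` rather than code like this.

```typescript
import { createHash } from "node:crypto";

// Unsigned LEB128 varint, used by CAR for length prefixes.
function varint(n: number): number[] {
  const out: number[] = [];
  while (n >= 0x80) {
    out.push((n & 0x7f) | 0x80);
    n >>>= 7;
  }
  out.push(n);
  return out;
}

// Binary CIDv1 for a raw block: <version><raw codec><sha2-256 code><digest length><digest>.
function rawCid(block: Uint8Array): number[] {
  const digest = createHash("sha256").update(block).digest();
  return [0x01, 0x55, 0x12, digest.length, ...Array.from(digest)];
}

const text = (s: string): number[] => Array.from(new TextEncoder().encode(s));

// Build a CARv1 whose single root is also its only block.
function singleBlockCar(block: Uint8Array): Uint8Array {
  const cid = rawCid(block);
  // DAG-CBOR header: { roots: [cid], version: 1 }
  const header = [
    0xa2, // map with 2 entries
    0x65, ...text("roots"), // key "roots" (5-character text string)
    0x81, // array with 1 element
    0xd8, 0x2a, // CBOR tag 42 marks a CID link
    0x58, cid.length + 1, // byte string with a 1-byte length
    0x00, ...cid, // identity multibase prefix + binary CID
    0x67, ...text("version"), 0x01, // key "version", value 1
  ];
  const section = [...cid, ...Array.from(block)];
  return new Uint8Array([
    ...varint(header.length), ...header,
    ...varint(section.length), ...section,
  ]);
}

const carBytes = singleBlockCar(new TextEncoder().encode("hello world"));
console.log(carBytes.length); // 107
```

Because the CID travels with the block bytes, any node that imports the file ends up providing exactly the CID that was computed on the client.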