In any discussion of AI today, it's hard to avoid talking about Hugging Face, and for good reason! Hugging Face is a platform hosting over 100,000 pre-trained machine-learning models. These models cover diverse tasks such as:

  • Text generation: Create poems, scripts, emails, and more.
  • Question answering: Get factual answers to questions on a wide range of topics.
  • Text classification: Label text with categories like sentiment or topic.
  • Image classification: Identify objects or scenes in images.
  • Speech recognition: Convert audio to text.

You may have heard of some of these models, such as Llama 2, Mistral, and DALL-E, but Hugging Face also hosts tens of thousands of others, such as Whisper for automatic speech recognition and MAGNeT for text-to-music generation.

But with such a large collection, it would be, as you might guess, confusing to deal with all of these models on your own. Fortunately, Hugging Face has an API that simplifies this process. That's where Hugging Face JS comes in.

Hugging Face JS is a collection of JavaScript libraries that help you interact with the Hugging Face API.

Hugging Face JS provides three main libraries:

  1. @huggingface/inference: Makes calls to the Hugging Face API, enabling you to use any of those 100,000+ pre-trained models in your JavaScript project. Think of it as a bridge between your code and the AI models (there's a short sketch of it right after this list).
  2. @huggingface/agents: Offers a natural language interface for interacting with Hugging Face models. Instead of writing code, you simply ask questions or give instructions in plain English. This is still under development, but it promises a more accessible way to leverage AI.
  3. @huggingface/hub: Lets you interact with huggingface.co to create or delete repos and to commit or download files.
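To give you a feel for the first of these, here is a minimal sketch of <code>@huggingface/inference</code> in action. The model name and prompt are just examples, and the token is a placeholder; we'll set one up properly below:

import { HfInference } from '@huggingface/inference'

// Assumes you already have a Hugging Face access token (covered below)
const hf = new HfInference('hf_...') // placeholder token

// Ask a text-generation model to continue a prompt
const output = await hf.textGeneration({
  model: 'gpt2', // example model name
  inputs: 'JavaScript and machine learning are', // example prompt
})

console.log(output.generated_text)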

Here are some key things to know about Hugging Face JS:

  • Modern and Efficient: It uses modern JavaScript features and avoids unnecessary dependencies, making it fast and lightweight.
  • Focus on Pre-trained Models: It's primarily designed to use existing models, not train your own. This simplifies development and lets you access powerful AI without being an expert.
  • Community-Driven: Hugging Face has a large and active community, with extensive documentation, tutorials, and examples to help you get started.

Overall, Hugging Face JS makes it easier than ever to integrate AI into your JavaScript projects. Whether you're building web applications, mobile apps, or even chatbot experiences, Hugging Face JS provides a convenient and powerful toolkit to leverage the vast potential of pre-trained machine learning models.

Let's see how this works by writing a simple example in JavaScript.

Getting started with huggingface.js

The first thing we need to do is get set up with an access token. 

Follow these steps:

  1. First of all, visit the official website, huggingface.co, and sign up and sign in.

  2. Next, retrieve a user access token (API key). Go to your profile → settings → access tokens, then click the “New Token” button.

  3. In the modal window that opens, give the token a name that will help you remember what it was generated for, for example, “My first test project”, and click “Generate token”.

  4. Once the token is generated, click the "copy" button next to it in the list that appears to copy the token.

OK! Now that we have a Hugging Face access token, we're ready to write some actual code.

Let’s code!

To show all of this in action, we're going to create a JavaScript script that uses the <code>nlpconnect/vit-gpt2-image-captioning</code> model to generate a caption for an image we feed it.

This article assumes that you already have Node.js installed (version 18 or later, so the built-in <code>fetch</code> API used below is available). If this is not the case, first install it according to the instructions on the official website: nodejs.org

You can use any IDE or even just a plain text editor; all we need is the ability to run Node.js from the command line. I'm going to use IntelliJ IDEA by JetBrains, but you can use the tools of your choice.

Follow these steps:

  1. If you're using an IDE, create a new project (or just open a new directory).
  2. Open the Terminal and run <code>npm init -y</code> to initialize your empty Node.js project.

  3. Once created, open your new <code>package.json</code> file, add a “type” property after “main”, and give it a value of <code>"module"</code> so that we can use ES module <code>import</code> syntax.
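Here's a trimmed-down example of the result (the “name” field will reflect your own project, and <code>npm init -y</code> also generates a few other fields, which you can leave as they are):

{
  "name": "hf-captioning-demo",
  "version": "1.0.0",
  "main": "index.js",
  "type": "module"
}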

  4. Next, let's install the required libraries, <code>dotenv</code> and <code>@huggingface/inference</code>:
npm install @huggingface/inference dotenv --save
  5. Then create an empty .env file in the root directory of your project and add the Hugging Face token you copied earlier (if you're using version control, remember to add <code>.env</code> to your <code>.gitignore</code> so the token isn't committed):
HF_ACCESS_TOKEN="<hf_token>"

Create the actual script

All the preparations are complete, so now it's time to code our main script. We're going to use an image captioning model that will describe what is shown in the image.

Follow these steps:

  1. Create a file <code>index.js</code>. First of all, let’s import the required libraries:
import { HfInference } from '@huggingface/inference'
import dotenv from 'dotenv'
  2. Then load our <code>.env</code> config, which contains the HF access token:
dotenv.config()
  3. Next, we need to set some values. The <code>inference</code> constant initializes the corresponding library using our access token, <code>model</code> contains the model's name (<code>nlpconnect/vit-gpt2-image-captioning</code> in our case), and <code>imageUrl</code> contains our test image URL. In this case, it's a picture of a black dog.

OK, let's go ahead and set up these constants:

const inference = new HfInference(process.env.HF_ACCESS_TOKEN) // initialize the client with the access token loaded from .env
const model = 'nlpconnect/vit-gpt2-image-captioning' // model name
const imageUrl = 'https://picsum.photos/id/237/536/354' // your image URL
  4. Using the values above, let's get the results and output them!
// Fetch the image as a blob
let response = await fetch(imageUrl)
let imageBlob = await response.blob()

// Use the ImageToText function to get the caption
const results = await inference.imageToText({
  data: imageBlob,
  model: model,
})

// Output model's response
console.log(results)
  5. Now we can run our code by typing <code>node index.js</code> in the Terminal.

The model responds with <code>a black dog with a black collar and a wooden bench</code>, which seems quite accurate (although it's hard to tell from the photo whether the dog is actually wearing a collar).
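
For reference, <code>console.log</code> prints the full response object, which (assuming <code>imageToText</code> returns a single object with a <code>generated_text</code> field, and noting that your exact caption may vary) looks something like this:

{ generated_text: 'a black dog with a black collar and a wooden bench' }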

That’s it! You’ve just coded your first app using huggingface.js. From here, you can generalize to other models and tasks; there's a short sketch of that after the full code snippet below.

Here’s the full code snippet:

// Importing required libraries
import { HfInference } from '@huggingface/inference'
import dotenv from 'dotenv'

dotenv.config()

// Set some constants
const inference = new HfInference(process.env.HF_ACCESS_TOKEN) // initialize the client with the access token loaded from .env
const model = 'nlpconnect/vit-gpt2-image-captioning' // model name
const imageUrl = 'https://picsum.photos/id/237/536/354' // your image URL

// Fetch the image as a blob
let response = await fetch(imageUrl)
let imageBlob = await response.blob()

// Use the ImageToText function to get the caption
const results = await inference.imageToText({
  data: imageBlob,
  model: model,
})

// Output model's response
console.log(results)

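To generalize to another task, you mostly just swap the method and the model name. Here's a minimal sketch that classifies the sentiment of a sentence instead of captioning an image (the model name below is only an example; any text-classification model on the Hub should work):

// Same setup as before, but with a text-classification model
import { HfInference } from '@huggingface/inference'
import dotenv from 'dotenv'

dotenv.config()

const inference = new HfInference(process.env.HF_ACCESS_TOKEN)

const results = await inference.textClassification({
  model: 'distilbert-base-uncased-finetuned-sst-2-english', // example model
  inputs: 'Hugging Face JS makes this surprisingly easy!',
})

// Expect something like: [ { label: 'POSITIVE', score: 0.99 } ]
console.log(results)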

Happy coding! If you have questions, or if you need something more complicated, please don't hesitate to contact us.
