Using GridFS to store files in MongoDB (Java)

Feb 26, 2019 by Thibault Debatty

https://cylab.be/blog/16/using-gridfs-to-store-files-in-mongodb-java

MongoDB is a fantastic tool for storing large quantities of data in a JSON-like format.

Furthermore, it can be used to store binary data with GridFS. This has multiple advantages:

your files and metadata are automatically synced and distributed over your MongoDB cluster;
unlike some filesystems, GridFS has no maximum number of files;
GridFS lets you access portions of a large file without having to read the whole file into memory.

Technically, GridFS is used to store and retrieve files that exceed the BSON document size limit of 16 MB. Instead of storing the file in a single document, GridFS divides the file into chunks (255 kB each by default) and stores each chunk as a separate document.
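
To make the chunking idea concrete, here is a minimal sketch in plain Java that cuts a byte array into fixed-size pieces and computes which piece holds a given offset. It is only an illustration, not the driver's actual code: the class name ChunkSketch, its methods and the 600 kB file are made up for this example, and the chunk size is assumed to be the 255 kB default.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: mimics how a file is cut into fixed-size chunks. */
final class ChunkSketch {

    // Assumption: the bucket uses the default GridFS chunk size of 255 kB.
    static final int DEFAULT_CHUNK_SIZE = 255 * 1024;

    /** Splits data into consecutive chunks of at most chunkSize bytes. */
    static List<byte[]> split(byte[] data, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<byte[]> chunks = new ArrayList<>();
        for (int start = 0; start < data.length; start += chunkSize) {
            int end = Math.min(start + chunkSize, data.length);
            chunks.add(Arrays.copyOfRange(data, start, end));
        }
        return chunks;
    }

    /** Index of the chunk that holds the byte at the given offset. */
    static long chunkIndexFor(long offset, int chunkSize) {
        return offset / chunkSize;
    }

    public static void main(String[] args) {
        byte[] file = new byte[600 * 1024]; // hypothetical 600 kB file
        List<byte[]> chunks = split(file, DEFAULT_CHUNK_SIZE);
        System.out.println(chunks.size() + " chunks, last one holds "
            + chunks.get(chunks.size() - 1).length + " bytes");
    }
}
```

Because every chunk except the last has the same size, the chunk holding any byte offset can be found with a simple division. This is what makes it possible to read a portion of a large file without loading the rest.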

Storing

To store a file in GridFS using Java:

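import com.mongodb.client.gridfs.GridFSBucket;
import com.mongodb.client.gridfs.GridFSBuckets;
import com.mongodb.client.gridfs.GridFSUploadStream;
import org.bson.types.ObjectId;

//Bucket to generate GridFSUploadStreams
GridFSBucket bucket = GridFSBuckets.create(
  mongo_client.getDatabase(db_name),
  bucket_name);

//Each file needs its own upload stream
GridFSUploadStream uploadStream = bucket.openUploadStream(filename);
byte[] data = ...;
uploadStream.write(data);

//The id is assigned when the stream is opened; close() flushes the last chunk and writes the file document
ObjectId fileid = uploadStream.getObjectId();
uploadStream.close();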

Retrieving

The file can now be retrieved using either its fileid or its filename:

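import java.io.FileOutputStream;

//Using fileid; try-with-resources closes the stream even if the download fails
try (FileOutputStream streamToDownloadTo = new FileOutputStream(local_file)) {
  bucket.downloadToStream(fileid, streamToDownloadTo);
}

//Alternatively, using the mongo filename (by default, the latest revision with that name)
try (FileOutputStream streamToDownloadTo = new FileOutputStream(local_file)) {
  bucket.downloadToStream(filename, streamToDownloadTo);
}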

You can also retrieve the data as a byte[]:

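GridFSDownloadStream downloadStream = bucket.openDownloadStream(fileid);
int fileLength = (int) downloadStream.getGridFSFile().getLength();
byte[] data = new byte[fileLength];
//A single read() may return fewer bytes than requested, so loop until the buffer is full
int offset = 0, n;
while (offset < fileLength
    && (n = downloadStream.read(data, offset, fileLength - offset)) > 0) {
  offset += n;
}
downloadStream.close();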
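
Reading into a byte[] has one pitfall common to any InputStream: a single read() call is allowed to return fewer bytes than the buffer can hold. The sketch below shows the usual loop in plain Java, without the MongoDB driver. The class ReadFullySketch and the trickle() helper are hypothetical names introduced for this illustration; trickle() simply simulates a stream that delivers its data a few bytes at a time.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

final class ReadFullySketch {

    /** Reads exactly length bytes, looping because read() may return fewer bytes than asked. */
    static byte[] readFully(InputStream in, int length) throws IOException {
        byte[] buffer = new byte[length];
        int offset = 0;
        while (offset < length) {
            int n = in.read(buffer, offset, length - offset);
            if (n < 0) {
                throw new IOException("Stream ended after " + offset + " of " + length + " bytes");
            }
            offset += n;
        }
        return buffer;
    }

    /** Hypothetical test double: hands out at most 4 bytes per read() call. */
    static InputStream trickle(byte[] source) {
        return new ByteArrayInputStream(source) {
            @Override
            public synchronized int read(byte[] b, int off, int len) {
                return super.read(b, off, Math.min(len, 4));
            }
        };
    }

    public static void main(String[] args) throws IOException {
        byte[] original = "stored in GridFS".getBytes();
        byte[] copy = readFully(trickle(original), original.length);
        System.out.println(new String(copy));
    }
}
```

On Java 9 and later, InputStream.readAllBytes() or readNBytes() can replace the hand-written loop.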

Mongo shell

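Finally, you can use the Mongo shell to check your files. GridFS stores each file's metadata in the <bucket_name>.files collection and its content in the <bucket_name>.chunks collection, so both can be queried like any other collection. To retrieve a file from the command line, you can use the mongofiles tool.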

This blog post is licensed under CC BY-SA 4.0