Using Compression for Artifacts in Peak-SDK and Peak-CLI

Why do we need Artifacts and Compression?

If you are creating a resource that depends on an image (such as an App spec, a Block spec, or an Image) and you don't want it to be sourced directly from the source code repository, you will need to pack all the files required to build the image into a zip archive, which we call an Artifact. All platform APIs accept only zip-compressed files. Both the CLI and the SDK have utilities to help you create and manage Artifacts.

To prepare an artifact for upload, specify the path to the directory that contains all the files needed to build the image and, optionally, a list of ignore files. The CLI and SDK will then create a zip archive of the files in that directory, skipping any file that matches the patterns in the ignore files.
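The packaging step described above can be pictured with the standard library alone. This is only a minimal sketch, not the SDK's implementation: the `zip_directory` function and its `should_ignore` callback are hypothetical names introduced for illustration.

```python
import io
import os
import zipfile
from typing import Callable


def zip_directory(root: str, should_ignore: Callable[[str], bool]) -> io.BytesIO:
    """Zip every file under `root` unless `should_ignore(relative_path)` is true.

    Hypothetical helper for illustration; not part of peak.compression.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for current_dir, _dirs, files in os.walk(root):
            for name in sorted(files):
                full_path = os.path.join(current_dir, name)
                # Store paths relative to `root`, with "/" separators.
                rel_path = os.path.relpath(full_path, root).replace(os.sep, "/")
                if not should_ignore(rel_path):
                    archive.write(full_path, arcname=rel_path)
    buffer.seek(0)  # rewind so callers can read() from the start
    return buffer
```

For example, `zip_directory(".", lambda p: p.endswith(".txt"))` would return an in-memory archive of the current directory without its .txt files.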

Usage Examples

Consider the following directory structure:

root_dir
├── .dockerignore
├── .gitignore
├── Dockerfile
├── dir1
│   ├── file.txt
│   └── main.py
└── dir2
    ├── file.txt
    └── main.py

Contents of .gitignore file

*
!**/*.txt

Contents of .dockerignore file

*
!Dockerfile
!**/*.py

Basic usage with path and ignore_files

This example uses the compression module to create a zip file named artifact.zip from the files in the current directory. Files that match the patterns listed in the .gitignore file are left out.

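from peak.compression import compress

# Compress the current directory, applying the patterns in .gitignore.
with compress(".", ignore_files=[".gitignore"]) as zip_file:
    # Write the compressed bytes to artifact.zip.
    with open("artifact.zip", "wb") as f:
        f.write(zip_file.read())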

File structure of the zip file:

root_dir
├── dir1
│   └── file.txt
└── dir2
    └── file.txt
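To check what actually ended up in artifact.zip, the archive can be inspected with Python's standard zipfile module. The `list_archive` helper below is a hypothetical name used for illustration and is not part of the SDK.

```python
import zipfile
from typing import List


def list_archive(zip_path: str) -> List[str]:
    """Return the sorted entry names stored in the zip archive at `zip_path`."""
    with zipfile.ZipFile(zip_path) as archive:
        return sorted(archive.namelist())
```

Calling `list_archive("artifact.zip")` after running the example above lists the entries that were written, which can then be compared with the expected file structure.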

Usage without ignore_files

ignore_files is an optional argument. If it is not provided, compress looks for a .dockerignore file in the given path and uses it as the ignore file.

from peak.compression import compress

# No ignore_files given, so the .dockerignore in the path is used.
with compress(".") as zip_file:
    with open("artifact.zip", "wb") as f:
        f.write(zip_file.read())

File structure of the zip file:

root_dir
├── Dockerfile
├── dir1
│   └── main.py
└── dir2
    └── main.py

Compress without ignore_files and a missing .dockerignore

If the ignore_files argument is not given and no .dockerignore file is present in path, all the files under path are included in the zip file.

from peak.compression import compress

# No ignore_files and no .dockerignore, so every file is included.
with compress(".") as zip_file:
    with open("artifact.zip", "wb") as f:
        f.write(zip_file.read())

File structure of the zip file:

root_dir
├── .gitignore
├── Dockerfile
├── dir1
│   ├── file.txt
│   └── main.py
└── dir2
    ├── file.txt
    └── main.py
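The fallback behaviour described in the last two sections can be summarised in a few lines. This is a sketch under stated assumptions rather than the SDK's code: `resolve_ignore_files` is a hypothetical helper, and treating an empty list the same as a missing argument is an assumption.

```python
import os
from typing import List, Optional


def resolve_ignore_files(path: str, ignore_files: Optional[List[str]] = None) -> List[str]:
    """Hypothetical helper: decide which ignore files apply for `path`."""
    if ignore_files:
        # An explicit list always wins.
        return list(ignore_files)
    default = os.path.join(path, ".dockerignore")
    # Fall back to .dockerignore when it exists; otherwise nothing is ignored.
    return [".dockerignore"] if os.path.isfile(default) else []
```

An empty result corresponds to the case above where every file under path is zipped.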

Usage with multiple ignore files

If multiple ignore files are passed, precedence runs from lowest to highest: the first file in the list has the lowest priority and the last file has the highest. In the example below, all patterns in the .dockerignore file take precedence over the ones in the .gitignore file.

from peak.compression import compress

# Later files in the list override earlier ones, so .dockerignore wins here.
with compress(".", ignore_files=[".gitignore", ".dockerignore"]) as zip_file:
    with open("artifact.zip", "wb") as f:
        f.write(zip_file.read())

File structure of the zip file:

root_dir
├── Dockerfile
├── dir1
│   └── main.py
└── dir2
    └── main.py
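One way to reason about this result is to treat the patterns from all ignore files as a single list, in order, where the last matching pattern decides. The sketch below assumes that model and uses Python's fnmatch as a simplified stand-in for ignore-file matching; `is_included` is a hypothetical helper, and the SDK's actual matcher may differ in edge cases.

```python
from fnmatch import fnmatch
from typing import List


def is_included(rel_path: str, patterns: List[str]) -> bool:
    """Last matching pattern wins; a leading "!" re-includes the path."""
    included = True  # nothing is ignored until a pattern matches
    for pattern in patterns:
        negated = pattern.startswith("!")
        glob = pattern[1:] if negated else pattern
        if fnmatch(rel_path, glob):
            included = negated
    return included


gitignore = ["*", "!**/*.txt"]
dockerignore = ["*", "!Dockerfile", "!**/*.py"]
# Patterns are concatenated in list order, so later files override earlier ones.
combined = gitignore + dockerignore

files = ["Dockerfile", "dir1/file.txt", "dir1/main.py", "dir2/file.txt", "dir2/main.py"]
kept = [f for f in files if is_included(f, combined)]
print(kept)  # ['Dockerfile', 'dir1/main.py', 'dir2/main.py']
```

Reversing the list order (dockerignore + gitignore) would keep only the .txt files under this model, since the .gitignore patterns would then be evaluated last.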