API Reference¶

This module provides multiple parsers for RFC-7578 multipart/form-data, both low-level for framework authors and high-level for WSGI or ASGI application developers.

https://multipart.readthedocs.io/

Non-Blocking Parser¶

class multipart.PushMultipartParser¶

An incremental non-blocking parser for multipart/form-data.

This class provides a non-blocking parse() method as well as several convenience methods to parse blocking or async data streams.

Strict mode: In strict mode, the parser will be less forgiving and bail out more quickly when presented with strange or invalid input, avoiding unnecessary work caused by broken or malicious clients. Fatal errors will always trigger exceptions, even in non-strict mode.

Limits: The various max_* limits are meant as safeguards. Exceeding any of those limits will trigger ParserLimitReached.

Parser instances can be used as context managers in a with statement to ensure that close() is called after leaving the parser loop. This is important to detect incomplete multipart streams.

__init__(boundary: str | bytes | bytearray, content_length=-1, max_header_size=4224, max_header_count=8, max_segment_size=inf, max_segment_count=inf, header_charset='utf8', strict=False)¶

Create a new parser instance.

Parameters:

boundary – A valid multipart boundary as found in the Content-Type header.
content_length – Expected input size in bytes, or -1 if unknown.
max_header_size – Maximum length of a single header line (name and value).
max_header_count – Maximum number of headers per segment.
max_segment_size – Maximum size of a single segment body.
max_segment_count – Maximum number of segments.
header_charset – Charset for header names and values.
strict – Enables additional format and sanity checks.
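
The boundary argument comes from the Content-Type header of the request. The following is a minimal sketch of extracting it and constructing a parser. It uses the standard library's email.message.Message for the header parsing so that the first helper runs without this library installed; parse_options_header(), documented further down, could be used for the same purpose. The helper names extract_boundary and make_parser and the limit values are illustrative assumptions, not recommendations.

```python
from email.message import Message


def extract_boundary(content_type: str) -> str:
    """Return the boundary parameter of a multipart/form-data Content-Type."""
    msg = Message()
    msg["Content-Type"] = content_type
    boundary = msg.get_param("boundary")  # None if the parameter is missing
    if msg.get_content_type() != "multipart/form-data" or not boundary:
        raise ValueError("not a multipart/form-data request with a boundary")
    return boundary


def make_parser(content_type: str, content_length: int = -1):
    """Create a PushMultipartParser with explicit (illustrative) limits."""
    # Imported lazily so extract_boundary() works without the library.
    from multipart import PushMultipartParser

    return PushMultipartParser(
        extract_boundary(content_type),
        content_length=content_length,
        max_segment_count=100,               # assumed value, tune per application
        max_segment_size=10 * 1024 * 1024,   # assumed value, 10 MiB per segment
        strict=True,
    )
```

Both size limits are unlimited by default, so setting them explicitly is usually worthwhile when parsing untrusted input.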

closed¶: True if the parser was closed, either successfully by reaching the end of the multipart stream, or due to an error.

error: MultipartError | None¶: A MultipartError instance if parsing failed.

__exit__(exc_type, exc_val, exc_tb)¶: Close the parser. If the call was caused by an exception, the final check for a complete multipart stream is skipped to avoid another exception.

parse(chunk: bytes | bytearray) → Generator[MultipartSegment | bytes | None, None, None]¶

Parse a chunk of data and yield as many parser events as possible from the given chunk.

Parser Events: For each multipart segment the parser will emit a single instance of MultipartSegment with header and meta information, followed by zero or more non-empty bytes with chunks from the segment body, followed by a single None event to signal the end of the current segment.

This method does not perform any IO on its own. It stops yielding events if more data is needed and should be called again with the next chunk to continue. The returned generator must be fully consumed before parsing the next chunk. Once the end of the multipart stream is reached and the last event was emitted, closed will be true.

End of input can be signaled by parsing an empty chunk, calling close() or using the parser as a context manager and leaving the context. Closing the parser is important to be notified about incomplete or missing segments.

Parameters:

chunk – A non-empty chunk of data, or an empty chunk to signal end of input.

Raises:

ParserError – Input is not a valid multipart stream.
StrictParserError – Unusual input while parsing in strict mode.
ParserLimitReached – One of the configured limits reached.
ParserStateError – Parser used incorrectly (e.g. use after close).
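
The event sequence described above (one segment object, zero or more body chunks, then None) can be consumed with a small state machine. The sketch below is duck-typed: it only assumes that segment events have a name attribute, and the demo feeds it hand-built stand-in events instead of real parser output. collect_fields is a hypothetical helper, not part of this library.

```python
from types import SimpleNamespace


def collect_fields(events):
    """Group a stream of parser events into a {name: body_bytes} dict."""
    fields = {}
    current = None       # segment whose body is currently being received
    body = bytearray()
    for event in events:
        if event is None:                             # end of current segment
            fields[current.name] = bytes(body)
            current = None
        elif isinstance(event, (bytes, bytearray)):   # chunk of segment body
            body += event
        else:                                         # start of a new segment
            current = event
            body = bytearray()
    return fields


# Stand-in events that mimic the documented order: segment, chunks, None.
demo_events = [
    SimpleNamespace(name="title"), b"Hello ", b"World", None,
    SimpleNamespace(name="empty"), None,
]
demo_fields = collect_fields(demo_events)
```

With a real parser, the events of all chunks need to reach the same loop, for example by passing the generator returned by parse_blocking() directly. Buffering whole bodies in memory like this is only reasonable in combination with max_segment_size and max_segment_count.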

async parse_async(read: Callable[[int], Awaitable[bytes | bytearray]], chunk_size=65536) → AsyncGenerator[MultipartSegment | bytes | None, None]¶

Parse the entire multipart stream by reading chunks from an async read(size) function. The returned async generator yields parser events similar to parse() and can be used in async for loops.

The async read(size) function should read and return up to size bytes of data per call. Returning an empty chunk signals the end of the input stream. This is compatible with asyncio.StreamReader.read() and most other async read methods.

The parser will not try to read more than content_length bytes, if known. It will also stop reading once the end of the multipart stream is detected, even if more data is available in the input stream.

Parameters:

read – An async read function returning chunks of data from an input stream.
chunk_size – A positive integer limiting how many bytes are requested per read operation.

Yields:

Parser events (see parse())

Raises:

Exception – Exceptions raised by read are not handled.
MultipartError – Same as parse().

Added in version 1.3.
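
To make the read(size) contract concrete, here is a sketch of an async read function over an in-memory buffer, followed by a consumer that sums body bytes per segment. make_async_reader and segment_sizes are hypothetical names; the consumer imports the library lazily and is not executed by the demo at the bottom.

```python
import asyncio


def make_async_reader(data: bytes):
    """Return an async read(size) callable over an in-memory buffer."""
    pos = 0

    async def read(size: int) -> bytes:
        nonlocal pos
        chunk = data[pos:pos + size]   # up to size bytes, empty at end of input
        pos += len(chunk)
        return chunk

    return read


async def segment_sizes(read, boundary):
    """Map each segment name to the number of body bytes received."""
    # Imported lazily so the reader above can be tried without the library.
    from multipart import MultipartSegment, PushMultipartParser

    sizes = {}
    current = None
    with PushMultipartParser(boundary) as parser:
        async for event in parser.parse_async(read):
            if isinstance(event, MultipartSegment):
                current = event.name
                sizes[current] = 0
            elif event is not None:    # non-empty chunk of body data
                sizes[current] += len(event)
    return sizes


async def _demo():
    read = make_async_reader(b"abcdefgh")
    return [await read(3), await read(3), await read(3), await read(3)]


demo_chunks = asyncio.run(_demo())
```

The final empty chunk is what signals end of input to the parser. An asyncio.StreamReader.read method can be passed in place of the in-memory reader.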

parse_blocking(read: Callable[[int], bytes | bytearray], chunk_size=65536) → Generator[MultipartSegment | bytes | None, None, None]¶

Parse the entire multipart stream by reading chunks from a blocking read(size) function. The returned generator yields parser events similar to parse() and can be used in for loops.

The blocking read(size) function should read and return up to size bytes of data per call. Returning an empty chunk signals the end of the input stream.

The parser will not try to read more than content_length bytes, if known. It will also stop reading once the end of the multipart stream is detected, even if more data is available in the input stream.

Parameters:

read – A blocking read function returning chunks of data from an input stream.
chunk_size – A positive integer limiting how many bytes are requested per read operation.

Yields:

Parser events (see parse())

Raises:

Exception – Exceptions raised by read are not handled.
MultipartError – Same as parse().

Added in version 1.3.
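
Any object with a blocking read(size) method works as input, including io.BytesIO. The sketch below builds a small multipart/form-data body by hand (useful for tests) and shows how it could be fed to parse_blocking(). build_form_body and field_names are hypothetical helpers; the builder assumes field names contain no quotes or line breaks and does not escape them.

```python
import io


def build_form_body(fields, boundary):
    """Encode a {name: str_or_bytes} dict as a multipart/form-data body."""
    delimiter = b"--" + boundary.encode("ascii")
    lines = []
    for name, value in fields.items():
        if isinstance(value, str):
            value = value.encode("utf8")
        lines += [
            delimiter,
            b'Content-Disposition: form-data; name="%s"' % name.encode("utf8"),
            b"",        # empty line separates headers from the body
            value,
        ]
    lines += [delimiter + b"--", b""]   # closing delimiter and final CRLF
    return b"\r\n".join(lines)


def field_names(body, boundary):
    """Return the names of all segments found in body."""
    # Imported lazily so build_form_body() works without the library.
    from multipart import MultipartSegment, PushMultipartParser

    stream = io.BytesIO(body)
    names = []
    with PushMultipartParser(boundary, content_length=len(body)) as parser:
        for event in parser.parse_blocking(stream.read):
            if isinstance(event, MultipartSegment):
                names.append(event.name)
    return names


demo_body = build_form_body({"title": "Hello", "blob": b"\x00\x01"}, "demo-boundary")
```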

close(check_complete=True)¶

Close this parser if not already closed.

Parameters: check_complete – Raise ParserError if the parser did not reach the end of the multipart stream yet.

class multipart.MultipartSegment¶

A MultipartSegment represents the header section of a single multipart part and provides convenient access to part headers and other details (e.g. name and filename).

Segment instances do not store or buffer any payload data, but the parser will count bytes_received while parsing, and make the final size available once the segment is complete.

__init__(headerlist: List[Tuple[str, str]])¶: Private constructor used by PushMultipartParser

bytes_received: int¶: Segment body bytes received so far. Will be updated for each chunk of payload during parsing.

complete: bool¶: True if the parser detected the end of the segment and size is final.

headerlist: List[Tuple[str, str]]¶: Ordered list of headers as (name, value) pairs. Header names are normalized (Title-Case) and values are stripped of leading or trailing whitespace.

disposition: str | None¶: The lower-cased Content-Disposition segment header without header options. This will be 'form-data' for valid HTTP form submissions.

name: str | None¶: The segment 'name' as specified in the Content-Disposition segment header. For form-data this will always be a string, but the string may be empty.

filename: str | None¶: An optional 'filename' if specified in the Content-Disposition header.

content_type: str | None¶: The lower-cased Content-Type segment header without header options.

charset: str | None¶: The optional 'charset' option of the Content-Type header.

property size¶

Final segment body size in bytes.

Only available after the segment is complete. Use bytes_received if you need the in-progress byte count while parsing.

header(name: str, default=None)¶: Return the value of a header if present, or a default value.

__getitem__(name)¶: Return a header value if present, or raise KeyError.

Buffered Parser¶

class multipart.MultipartParser¶

__init__(stream, boundary, content_length=-1, charset='utf8', strict=False, buffer_size=65536, header_limit=8, headersize_limit=4224, part_limit=128, partsize_limit=inf, spool_limit=65536, memory_limit=8388608, disk_limit=inf, mem_limit=0, memfile_limit=0)¶

A parser that reads from a multipart/form-data encoded byte stream and yields buffered MultipartPart instances.

The parser acts as a lazy iterator and will only read and parse as much data as needed to return the next part. Results are cached and the same part can be requested multiple times without extra cost.

Note that you should either set partsize_limit or disk_limit depending on your specific requirements. Both are unlimited by default.

Parameters:

stream – A readable byte stream or any other object that implements a read(size) method.
boundary – The multipart boundary as found in the Content-Type header.
charset – Default charset for headers and text fields.
strict – Enables additional format and sanity checks.
buffer_size – Chunk size when reading from the source stream.
header_limit – Maximum number of headers per part.
headersize_limit – Maximum length of a single header line (name and value).
part_limit – Maximum number of parts.
partsize_limit – Maximum content size of a single part.
spool_limit – Parts up to this size are buffered in memory and count towards memory_limit. Larger parts are spooled to temporary files on disk and count towards disk_limit.
memory_limit – Maximum size of all memory-buffered parts. Should be smaller than spool_limit * part_limit to have an effect.
disk_limit – Maximum size of all disk-buffered parts.
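
The remark about memory_limit follows from simple arithmetic: at most part_limit parts can each keep up to spool_limit bytes in memory, so their product is the most that memory-buffered parts can ever occupy. The sketch below spells this out with hypothetical helper names. It only performs the calculation and does not touch the parser.

```python
def worst_case_memory(spool_limit: int, part_limit: int) -> int:
    """Upper bound for memory-buffered bytes if every part stays in memory."""
    return spool_limit * part_limit


def memory_limit_has_effect(memory_limit, spool_limit, part_limit) -> bool:
    """True if memory_limit is stricter than the bound implied by the others."""
    return memory_limit < worst_case_memory(spool_limit, part_limit)


# Defaults taken from the signature above.
default_bound = worst_case_memory(spool_limit=65536, part_limit=128)
```

With the defaults, the bound is 65536 * 128 = 8388608 bytes, which is exactly the default memory_limit. Raising spool_limit or part_limit without adjusting memory_limit makes memory_limit the effective cap.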

__iter__()¶: Parse the multipart stream and yield MultipartPart instances as soon as they are available.

parts()¶: Parse the entire multipart stream and return all MultipartPart instances as a list.

get(name, default=None)¶: Return the first part with a given name, or the default value if no matching part exists.

get_all(name)¶: Return all parts with the given name.

class multipart.MultipartPart¶

A MultipartPart represents a fully parsed multipart part and provides convenient access to part headers and other details (e.g. name and filename) as well as its memory- or disk-buffered binary or text content.

__init__(segment: MultipartSegment, buffer_size=65536, memfile_limit=262144, charset='utf8')¶: Private constructor, used by MultipartParser

file: BytesIO | BufferedRandom | None¶: A file-like buffer holding the part's binary content, or None if this part was closed.

size¶: Part size in bytes.

name¶: Part name.

filename¶: Part filename (if defined).

charset¶: Charset as defined in the part header, or the parser default charset.

headerlist¶: All part headers as a list of (name, value) pairs.

property headers: Headers¶: A convenient dict-like object holding all part headers.

property disposition: str¶: The value of the Content-Disposition part header.

property content_type: str¶: Cleaned up content type provided for this part, or a sensible default (application/octet-stream for files and text/plain for text fields).

is_buffered()¶: Return true if file is memory-buffered, or false if the part was larger than the spool_limit and content was spooled to temporary files on disk.

property value¶

Return the entire payload as a decoded text string.

Warning: this may consume a lot of memory. Check size first.

property raw¶

Return the entire payload as a raw byte string.

Warning: this may consume a lot of memory. Check size first.

save_as(path)¶: Save a copy of this part to path and return the number of bytes written.

close()¶: Close file and set it to None to free up resources.
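
The warnings on value and raw suggest checking size before loading a payload. A minimal sketch of such a guard follows. read_text_field and its 64 KiB default are assumptions for illustration, and the demo uses stand-in objects that only mimic the name, size and value attributes of a part.

```python
from types import SimpleNamespace


def read_text_field(part, max_size=64 * 1024):
    """Return part.value, refusing parts larger than max_size bytes."""
    if part.size > max_size:
        raise ValueError(f"field {part.name!r} is too large: {part.size} bytes")
    return part.value   # only decoded after the size check passed


# Stand-ins for MultipartPart instances, for demonstration only.
small = SimpleNamespace(name="comment", size=5, value="hello")
big = SimpleNamespace(name="dump", size=10**9, value=None)
small_text = read_text_field(small)
```

Oversized parts can still be handled without loading them into memory, for example by copying them to disk with save_as(path).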

WSGI Helper¶

multipart.is_form_request(environ)¶: Return True if the environ represents a form request that can be parsed with parse_form_data(). Checks for a compatible Content-Type header.

multipart.parse_form_data(environ, charset='utf8', strict=False, ignore_errors=None, **kwargs)¶

Parses both types of form data (multipart and url-encoded) from a WSGI environment and returns two MultiDict instances, one for text form fields (strings) and one for file uploads (MultipartPart instances). Text fields that are too big to fit into memory limits are treated as file uploads with no filename.

The default limits for MultipartParser apply, but can be overridden via keyword arguments. For url-encoded requests, only memory_limit and part_limit have an effect. They have the same defaults and meaning as with MultipartParser and limit the total size and the maximum number of form fields to parse.

Parameters:

environ – A WSGI environment dictionary. Only wsgi.input, CONTENT_TYPE and CONTENT_LENGTH are used.
charset – The default charset used to decode headers and text fields.
strict – Enables additional format and sanity checks.
ignore_errors – If True, suppress all exceptions. The returned results may be empty or incomplete. If False, then exceptions are not suppressed. A value of None (default) throws exceptions in strict mode but suppresses errors in non-strict mode.
kwargs – Additional keyword arguments (e.g. limits) passed to the MultipartParser.

Raises:

MultipartError – See ignore_errors parameter.
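
A sketch of how the two WSGI helpers might be combined in a request handler. make_environ builds a minimal environ with only the three keys named above, which is handy for tests. handle_form is a hypothetical handler that imports the library lazily and is not executed by the demo line at the bottom.

```python
import io


def make_environ(body: bytes, content_type: str) -> dict:
    """Build a minimal WSGI environ carrying a form request body."""
    return {
        "CONTENT_TYPE": content_type,
        "CONTENT_LENGTH": str(len(body)),
        "wsgi.input": io.BytesIO(body),
    }


def handle_form(environ):
    """Return a (status, text) pair describing the submitted form."""
    # Imported lazily so make_environ() works without the library.
    from multipart import MultipartError, is_form_request, parse_form_data

    if not is_form_request(environ):
        return "415 Unsupported Media Type", "expected a form submission"
    try:
        # strict=True with the default ignore_errors=None raises on errors.
        forms, files = parse_form_data(environ, strict=True)
    except MultipartError as exc:
        return "400 Bad Request", f"invalid form data: {exc}"
    names = sorted(set(forms.keys()) | set(files.keys()))
    return "200 OK", "fields: " + ", ".join(names)


demo_environ = make_environ(b"a=1&b=2", "application/x-www-form-urlencoded")
```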

Header parsing¶

multipart.parse_options_header(header, options=None, unquote=<function header_unquote>)¶

Parse Content-Type (or similar) headers into a primary value and an options-dict.

Note: For Content-Disposition headers you need a different unquote function. See content_disposition_unquote().

multipart.header_quote(val)¶

Quote header option values if necessary.

Note: This is NOT the way modern browsers quote field names or filenames in Content-Disposition headers. See content_disposition_quote().

multipart.header_unquote(val, filename=False)¶

Unquote header option values.

Note: This is NOT the way modern browsers quote field names or filenames in Content-Disposition headers. See content_disposition_unquote().

multipart.parse_content_disposition(value: str) → tuple[str, str | None, str | None]¶

Specialized parser for standard multipart Content-Disposition headers.

Returns a (disposition, name, filename) tuple. For multipart the disposition value should be 'form-data', but this is not enforced. name and filename can be None when the corresponding header parameter is missing. Additional parameters are ignored.

Parameter values are decoded with content_disposition_unquote() if necessary, but are not otherwise validated or sanitized.
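
Because the returned filename is not sanitized, it should not be used as a filesystem path directly. One possible, deliberately conservative approach is sketched below. safe_filename is a hypothetical helper, not part of this library, and its ASCII-only whitelist is an assumption that will also replace legitimate non-ASCII characters.

```python
import re
from pathlib import PureWindowsPath


def safe_filename(filename, default="upload.bin"):
    """Reduce a client-supplied filename to a harmless base name."""
    # PureWindowsPath treats both "\\" and "/" as separators, so this drops
    # directory components in either style, including drive letters.
    name = PureWindowsPath(filename or "").name
    # Keep a small set of characters and replace everything else.
    name = re.sub(r"[^A-Za-z0-9._-]", "_", name)
    # Leading dots would produce hidden files or ".." references.
    name = name.lstrip(".")
    return name or default
```

The result is only a base name. Choosing the target directory and avoiding collisions with existing files remain the responsibility of the application.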

multipart.content_disposition_quote(val)¶: Quote field names or filenames for Content-Disposition headers the same way modern browsers do it (see WHATWG HTML5 specification).

multipart.content_disposition_unquote(val, filename=False)¶

Unquote field names or filenames from Content-Disposition headers.

Legacy quoting mechanisms are detected to some degree and also supported, but there are rare ambiguous edge cases where we have to guess. If in doubt, this function assumes a modern browser and follows the WHATWG HTML5 specification (limited percent-encoding, no backslash-encoding).

If filename is true, additional Windows/IE6 legacy workarounds are applied.
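
As an illustration of "limited percent-encoding, no backslash-encoding": the WHATWG form submission algorithm escapes only line feeds, carriage returns and double quotes in field names and filenames. The sketch below is a simplified model of that rule with hypothetical function names. It is not this library's implementation and ignores the legacy cases mentioned above.

```python
def whatwg_escape(value: str) -> str:
    """Escape LF, CR and double quotes the way modern browsers do."""
    return value.replace("\n", "%0A").replace("\r", "%0D").replace('"', "%22")


def whatwg_unescape(value: str) -> str:
    """Reverse whatwg_escape(). Other percent sequences are left untouched."""
    return value.replace("%0A", "\n").replace("%0D", "\r").replace("%22", '"')


escaped = whatwg_escape('my "file"\n.txt')
```

The percent sign itself is not escaped, so a filename that literally contains %22 cannot be told apart from an escaped double quote. This is one source of the ambiguous edge cases.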

Utilities¶

class multipart.MultiDict¶

A dict that stores multiple values per key. Most dict methods return the last value by default. There are special methods to get all values.

__init__(*args, **kwargs)¶

keys() → a set-like object providing a view on D's keys¶

append(key, value)¶: Add an additional value to a key.

replace(key, value)¶: Replace all values for a key with a single value.

getall(key)¶: Return a list with all values for a key. The list may be empty.

get(k[, d]) → D[k] if k in D, else d. d defaults to None.¶

iterallitems()¶: Yield (key, value) pairs with repeating keys for each value.
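
The difference between the last-value default and the all-values methods is easiest to see on a toy model. ToyMultiDict below is a hypothetical stand-in written for this illustration. It mirrors only the behavior described above and is not the library's class.

```python
class ToyMultiDict:
    """Toy model of a multi-value dict (not the library's MultiDict)."""

    def __init__(self):
        self._data = {}   # key -> list of values in insertion order

    def append(self, key, value):
        self._data.setdefault(key, []).append(value)

    def replace(self, key, value):
        self._data[key] = [value]

    def getall(self, key):
        return list(self._data.get(key, []))

    def get(self, key, default=None):
        values = self._data.get(key)
        return values[-1] if values else default   # last value wins

    def iterallitems(self):
        for key, values in self._data.items():
            for value in values:
                yield key, value


tags = ToyMultiDict()
tags.append("tag", "red")
tags.append("tag", "blue")
last_tag = tags.get("tag")
all_tags = tags.getall("tag")
all_items = list(tags.iterallitems())
```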

Exceptions¶

exception multipart.MultipartError¶: Base class for all parser errors or warnings

exception multipart.ParserError¶: Detected invalid input

exception multipart.StrictParserError¶: Detected unusual input while parsing in strict mode

exception multipart.ParserLimitReached¶: Parser reached one of the configured limits

exception multipart.ParserStateError¶: Parser is used incorrectly (e.g. use after close)
