Types conversions

Data types supported by q and Python are incompatible and thus require additional translation. This page describes default rules used for converting data types between q and Python.

The translation mechanism used in qPython library is designed to:
  • deserialized message from kdb+ can be serialized and send back to kdb+ without additional processing,
  • end user can enforce type hinting for translation,
  • efficient storage for tables and lists is backed with numpy arrays.

Default type mapping can be overriden by using custom IPC serializers or deserializers implementations.

Atoms

While parsing IPC message atom q types are translated to Python types according to this table:

q type q num type Python type
bool -1 numpy.bool_
guid -2 UUID
byte -4 numpy.byte
short -5 numpy.int16
integer -6 numpy.int32
long -7 numpy.int64
real -8 numpy.float32
float -9 numpy.float64
character -10 single element str
timestamp -12 QTemporal  numpy.datetime64   ns
month -13 QTemporal  numpy.datetime64   M
date -14 QTemporal  numpy.datetime64   D
datetime -15 QTemporal  numpy.datetime64   ms
timespan -16 QTemporal  numpy.timedelta64  ns
minute -17 QTemporal  numpy.timedelta64  m
second -18 QTemporal  numpy.timedelta64  s
time -19 QTemporal  numpy.timedelta64  ms

Note

By default, temporal types in Python are represented as instances of qtemporal.QTemporal wrapping over numpy.datetime64 or numpy.timedelta64 with specified resolution. This setting can be modified (numpy_temporals = True) and temporal types can be represented without wrapping.

During the serialization to IPC protocol, Python types are mapped to q as described in the table:

Python type q type q num type
bool bool -1
byte -4
short -5
int int -6
long long -7
real -8
double float -9
numpy.bool bool -1
numpy.byte byte -4
numpy.int16 short -5
numpy.int32 int -6
numpy.int64 long -7
numpy.float32 real -8
numpy.float64 float -9
single element str character -10
QTemporal  numpy.datetime64   ns timestamp -12
QTemporal  numpy.datetime64   M month -13
QTemporal  numpy.datetime64   D date -14
QTemporal  numpy.datetime64   ms datetime -15
QTemporal  numpy.timedelta64  ns timespan -16
QTemporal  numpy.timedelta64  m minute -17
QTemporal  numpy.timedelta64  s second -18
QTemporal  numpy.timedelta64  ms time -19

Note

By default, single element strings are serialized as q characters. This setting can be modified (single_char_strings = True) and and single element strings are represented as q strings.

String and symbols

In order to distinguish symbols and strings on the Python side, following rules apply:

  • q symbols are represented as numpy.string_ type,
  • q strings are mapped to plain Python strings in Python 2 and bytes in Python 3.
# Python 2
# `quickbrownfoxjumpsoveralazydog
<type 'numpy.string_'>
numpy.string_('quickbrownfoxjumpsoveralazydog')

# "quick brown fox jumps over a lazy dog"
<type 'str'>
'quick brown fox jumps over a lazy dog'

# Python 3
# `quickbrownfoxjumpsoveralazydog
<class 'numpy.bytes_'>
b'quickbrownfoxjumpsoveralazydog'

# "quick brown fox jumps over a lazy dog"
<class 'bytes'>
b'quick brown fox jumps over a lazy dog'

Note

By default, single element strings are serialized as q characters. This setting can be modified (single_char_strings = True) and and single element strings are represented as q strings.

>>> # serialize single element strings as q characters
>>> print(q.sync('{[x] type each x}', ['one', 'two', '3'], single_char_strings = False))
[ 10,  10, -10]

>>> # serialize single element strings as q strings
>>> print(q.sync('{[x] type each x}', ['one', 'two', '3'], single_char_strings = True))
[10, 10, 10]

Lists

qPython represents deserialized q lists as instances of qcollection.QList are mapped to numpy arrays.

# (0x01;0x02;0xff)
qlist(numpy.array([0x01, 0x02, 0xff], dtype=numpy.byte))

# <class 'qpython.qcollection.QList'>
# numpy.dtype: int8
# meta.qtype: -4
# str: [ 1  2 -1]

Generic lists are represented as a plain Python lists.

# (1;`bcd;"0bc";5.5e)
[numpy.int64(1), numpy.string_('bcd'), '0bc', numpy.float32(5.5)]

While serializing Python data to q following heuristic is applied:

  • instances of qcollection.QList and qcollection.QTemporalList are serialized according to type indicator (meta.qtype):

    qlist([1, 2, 3], qtype = QSHORT_LIST)
    # (1h;2h;3h)
    
    qlist([366, 121, qnull(QDATE)], qtype=QDATE_LIST)
    # '2001.01.01 2000.05.01 0Nd'
    
    qlist(numpy.array([uuid.UUID('8c680a01-5a49-5aab-5a65-d4bfddb6a661'), qnull(QGUID)]), qtype=QGUID_LIST)
    # ("G"$"8c680a01-5a49-5aab-5a65-d4bfddb6a661"; 0Ng)
    
  • numpy arrays are serialized according to type of their dtype value:

    numpy.array([1, 2, 3], dtype=numpy.int32)
    # (1i;2i;3i)
    
  • if numpy array dtype is not recognized by qPython, result q type is determined by type of the first element in the array,

  • Python lists and tuples are represented as q generic lists:

    [numpy.int64(42), None, numpy.string_('foo')]
    (numpy.int64(42), None, numpy.string_('foo'))
    # (42;::;`foo)
    

Note

numpy arrays with dtype==|S1 are represented as atom character.

qPython provides an utility function qcollection.qlist() which simplifies creation of qcollection.QList and qcollection.QTemporalList instances.

The qtype module defines QSTRING_LIST const which simplifies creation of string lists:

qlist(numpy.array(['quick', 'brown', 'fox', 'jumps', 'over', 'a lazy', 'dog']), qtype = QSTRING_LIST)
qlist(['quick', 'brown', 'fox', 'jumps', 'over', 'a lazy', 'dog'], qtype = QSTRING_LIST)
['quick', 'brown', 'fox', 'jumps', 'over', 'a lazy', 'dog']
# ("quick"; "brown"; "fox"; "jumps"; "over"; "a lazy"; "dog")

Note

QSTRING_LIST type indicator indicates that list/array has to be mapped to q generic list.

Temporal lists

By default, lists of temporal values are represented as instances of qcollection.QTemporalList class. This class wraps the raw q representation of temporal data (i.e. longs for timestamps, ints for months etc.) and provides accessors which allow to convert raw data to qcollection.QTemporal instances in a lazy fashion.

>>> v = q.sync("2001.01.01 2000.05.01 0Nd", numpy_temporals = False)
>>> print('%s dtype: %s qtype: %d: %s' % (type(v), v.dtype, v.meta.qtype, v))
<class 'qpython.qcollection.QTemporalList'> dtype: int32 qtype: -14: [2001-01-01 [metadata(qtype=-14)] 2000-05-01 [metadata(qtype=-14)]
 NaT [metadata(qtype=-14)]]

>>> v = q.sync("2000.01.04D05:36:57.600 0Np", numpy_temporals = False)
>>> print('%s dtype: %s qtype: %d: %s' % (type(v), v.dtype, v.meta.qtype, v))
<class 'qpython.qcollection.QTemporalList'> dtype: int64 qtype: -12: [2000-01-04T05:36:57.600000000+0100 [metadata(qtype=-12)]
 NaT [metadata(qtype=-12)]]

The IPC parser (qreader.QReader) can be instructed to represent the temporal vectors via numpy.datetime64 or numpy.timedelta64 arrays wrapped in qcollection.QList instances. The parsing option can be set either via QConnection constructor or as parameter to functions: (sync()) or (receive()).

>>> v = q.sync("2001.01.01 2000.05.01 0Nd", numpy_temporals = True)
>>> print('%s dtype: %s qtype: %d: %s' % (type(v), v.dtype, v.meta.qtype, v))
<class 'qpython.qcollection.QList'> dtype: datetime64[D] qtype: -14: ['2001-01-01' '2000-05-01' 'NaT']

>>> v = q.sync("2000.01.04D05:36:57.600 0Np", numpy_temporals = True)
>>> print('%s dtype: %s qtype: %d: %s' % (type(v), v.dtype, v.meta.qtype, v))
<class 'qpython.qcollection.QList'> dtype: datetime64[ns] qtype: -12: ['2000-01-04T05:36:57.600000000+0100' 'NaT']

In this parsing mode, temporal null values are converted to numpy.NaT.

The serialization mechanism (qwriter.QWriter) accepts both representations and doesn’t require additional configuration.

There are two utility functions for conversions between both representations:

Dictionaries

qPython represents q dictionaries with custom qcollection.QDictionary class.

Examples:

QDictionary(qlist(numpy.array([1, 2], dtype=numpy.int64), qtype=QLONG_LIST),
            qlist(numpy.array(['abc', 'cdefgh']), qtype = QSYMBOL_LIST))
# q: 1 2!`abc`cdefgh


QDictionary([numpy.int64(1), numpy.int16(2), numpy.float64(3.234), '4'],
            [numpy.string_('one'), qlist(numpy.array([2, 3]), qtype=QLONG_LIST), '456', [numpy.int64(7), qlist(numpy.array([8, 9]), qtype=QLONG_LIST)]])
# q: (1;2h;3.234;"4")!(`one;2 3;"456";(7;8 9))

The qcollection.QDictionary class implements Python collection API.

Tables

The q tables are translated into custom qcollection.QTable class.

qPython provides an utility function qcollection.qtable() which simplifies creation of tables. This function also allow user to override default type conversions for each column and provide explicit q type hinting per column.

Examples:

qtable(qlist(numpy.array(['name', 'iq']), qtype = QSYMBOL_LIST),
      [qlist(numpy.array(['Dent', 'Beeblebrox', 'Prefect'])),
       qlist(numpy.array([98, 42, 126], dtype=numpy.int64))])

qtable(qlist(numpy.array(['name', 'iq']), qtype = QSYMBOL_LIST),
      [qlist(['Dent', 'Beeblebrox', 'Prefect'], qtype = QSYMBOL_LIST),
       qlist([98, 42, 126], qtype = QLONG_LIST)])

qtable(['name', 'iq'],
       [['Dent', 'Beeblebrox', 'Prefect'],
        [98, 42, 126]],
       name = QSYMBOL, iq = QLONG)

# flip `name`iq!(`Dent`Beeblebrox`Prefect;98 42 126)


qtable(('name', 'iq', 'fullname'),
       [qlist(numpy.array(['Dent', 'Beeblebrox', 'Prefect']), qtype = QSYMBOL_LIST),
        qlist(numpy.array([98, 42, 126]), qtype = QLONG_LIST),
        qlist(numpy.array(["Arthur Dent", "Zaphod Beeblebrox", "Ford Prefect"]), qtype = QSTRING_LIST)])

# flip `name`iq`fullname!(`Dent`Beeblebrox`Prefect;98 42 126;("Arthur Dent"; "Zaphod Beeblebrox"; "Ford Prefect"))

The keyed tables are represented by qcollection.QKeyedTable instances, where both keys and values are stored as a separate qcollection.QTable instances.

For example:

# ([eid:1001 1002 1003] pos:`d1`d2`d3;dates:(2001.01.01;2000.05.01;0Nd))
QKeyedTable(qtable(['eid'],
                   [qlist(numpy.array([1001, 1002, 1003]), qtype = QLONG_LIST)]),
            qtable(['pos', 'dates'],
                   [qlist(numpy.array(['d1', 'd2', 'd3']), qtype = QSYMBOL_LIST),
                    qlist(numpy.array([366, 121, qnull(QDATE)]), qtype = QDATE_LIST)]))

Functions, lambdas and projections

IPC protocol type codes 100+ are used to represent functions, lambdas and projections. These types are represented as instances of base class qtype.QFunction or descendent classes:

  • qtype.QLambda - represents q lambda expression, note the expression is required to be either:

    • q expression enclosed in {}, e.g.: {x + y}
    • k expression, e.g.: k){x + y}
  • qtype.QProjection - represents function projection with parameters:

    # { x + y}[3]
    QProjection([QLambda('{x+y}'), numpy.int64(3)])
    

Note

Only qtype.QLambda and qtype.QProjection are serializable. qPython doesn’t provide means to serialize other function types.

Errors

The q errors are represented as instances of qtype.QException class.

Null values

Please note that q null values are defined as:

_QNULL1 = numpy.int8(-2**7)
_QNULL2 = numpy.int16(-2**15)
_QNULL4 = numpy.int32(-2**31)
_QNULL8 = numpy.int64(-2**63)
_QNAN32 = numpy.fromstring('\x00\x00\xc0\x7f', dtype=numpy.float32)[0]
_QNAN64 = numpy.fromstring('\x00\x00\x00\x00\x00\x00\xf8\x7f', dtype=numpy.float64)[0]
_QNULL_BOOL = numpy.bool_(False)
_QNULL_SYM = numpy.string_('')
_QNULL_GUID = uuid.UUID('00000000-0000-0000-0000-000000000000')

Complete null mapping between q and Python is represented in the table:

q type q null value Python representation
bool 0b _QNULL_BOOL
guid 0Ng _QNULL_GUID
byte 0x00 _QNULL1
short 0Nh _QNULL2
int 0N _QNULL4
long 0Nj _QNULL8
real 0Ne _QNAN32
float 0n _QNAN64
string " " ' '
symbol ` _QNULL_SYM
timestamp 0Np _QNULL8
month 0Nm _QNULL4
date 0Nd _QNULL4
datetime 0Nz _QNAN64
timespan 0Nn _QNULL8
minute 0Nu _QNULL4
second 0Nv _QNULL4
time 0Nt _QNULL4

The qtype provides two utility functions to work with null values:

  • qnull() - retrieves null type for specified q type code,
  • is_null() - checks whether value is considered a null for specified q type code.

Custom type mapping

Default type mapping can be overwritten by providing custom implementations of QWriter and/or QReader and proper initialization of the connection as described in Custom IPC protocol serializers/deserializers.

QWriter and QReader use parse time decorator (Mapper) which generates mapping between q and Python types. This mapping is stored in a static variable: QReader._reader_map and QWriter._writer_map. In case mapping is not found in the mapping:

  • QWriter tries to find a matching qtype in the ~qtype.Q_TYPE dictionary and serialize data as q atom,
  • QReader tries to parse lists and atoms based on the type indicator in IPC stream.

While subclassing these classes, user can create copy of the mapping in the parent class and use parse time decorator:

class PandasQWriter(QWriter):
    _writer_map = dict.copy(QWriter._writer_map)    # create copy of default serializer map
    serialize = Mapper(_writer_map)                 # upsert custom mapping

    @serialize(pandas.Series)
    def _write_pandas_series(self, data, qtype = None):
        # serialize pandas.Series into IPC stream
        # ..omitted for readability..
        self._write_list(data, qtype = qtype)


class PandasQReader(QReader):
    _reader_map = dict.copy(QReader._reader_map)   # create copy of default deserializer map
    parse = Mapper(_reader_map)                    # overwrite default mapping

    @parse(QTABLE)
    def _read_table(self, qtype = QTABLE):
        # parse q table as pandas.DataFrame
        # ..omitted for readability..
        return pandas.DataFrame(data)

Refer to Custom type IPC deserialization for complete example.