Kdb+ ticker-plant interview questions
Q. Describe a typical ticker-plant set up.
A. - A feed-handler to provide real-time data.
- A ticker-plant (TP) to capture and log all the incoming data.
- A log file written by the ticker-plant, used to recover data after a failure.
- A real-time database (RDB) to hold the current day's data in memory and write it to the historical database at the
end of day, alongside any other real-time subscribers (RTS).
- A historical database (HDB) to access all data prior to the current day.
Q. Describe how a ticker-plant operates.
A. Data from the data feed is parsed by the feed-handler and converted into kdb+ messages. The feed-handler
then sends this data to the ticker-plant. The TP writes each message to the log file and, in batching mode,
holds it briefly in its own tables. The TP then publishes the data to the RDB and to any other real-time
subscribers. Once the subscribers have been updated the TP discards the data: it keeps no history and acts
only as a gateway. The RDB stores the data for the current day and accepts queries. At EOD the TP rolls over
to a new log file and tells its subscribers to run their end-of-day function. The RDB then saves its data to
the HDB, purges its tables and tells the HDB to load the new data.
Q. What is a chained ticker-plant?
A. A chained ticker-plant is a real-time subscriber of the master TP that in turn has real-time
subscribers of its own, to which it publishes data. Chaining takes publishing load off the master TP, which
helps keep latency low on the most time-critical path. For example, the master TP can be zero-latency, one
of its subscribers can be another TP which publishes every 100ms, and this in turn can have another TP as
its subscriber which publishes every 1s. This way a client can subscribe to a TP with the granularity that
suits its needs.
Q. What is a feed-handler?
A. The feed-handler is a specialized process which connects to a data feed. It retrieves the data and converts
it from the feed-specific format into a kdb+ message, which is then published to the ticker-plant process.
Q. In what language is the part of the feed-handler that receives data from the data feed written?
A. Usually a compiled language such as C, C++, Java or C#.
Q. What function does the feed-handler use to send data to the ticker-plant?
A. The feed-handler calls .u.upd on the ticker-plant, passing the table name and the data.
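As a rough illustration of that call, the sketch below evaluates a .u.upd message inside a single q process. The trade schema and the one-line stand-in for .u.upd are assumptions made for the example, not the ticker-plant's real definition.

```q
/ Sketch only: a stand-in for the TP's .u.upd so the example runs in one process.
trade:([]time:`timespan$();sym:`symbol$();price:`float$())   / hypothetical schema
.u.upd:{[t;x] t insert x;}

/ Column-oriented data: one list per column, two rows here.
data:(0D09:30:00 0D09:30:01;`a`b;1.5 2.5)

/ A feed-handler holding an open TP handle h would typically send
/   neg[h](`.u.upd;`trade;data)
/ Evaluating the same message locally shows what the TP does with it.
value(`.u.upd;`trade;data);
```

After the last line, trade holds the two rows carried by the message.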
Q. What does -w do?
A. It sets the maximum amount of memory, in MB, that a q process such as an RDB may use; if the process tries
to go beyond the limit it is terminated. A common rule of thumb is to set it to 4 times the expected size of
a day's data, so that if the EOD function fails for some reason there is still enough room for tomorrow's data.
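The limit can be inspected from inside the process. A small sketch, assuming the RDB was started with a -w value (the 4096 in the comment is only an example):

```q
/ Started as, for example:  q tick/r.q -w 4096     (limit in MB, value is illustrative)
m:.Q.w[]           / memory statistics as a dictionary
m`used`wmax        / bytes currently in use, and the -w limit in bytes (0 if no limit was set)
```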
Q. What would you do if your ticker-plant goes down?
A. The usual solution is to run a complete mirror of the production system. Clients can then be switched from
the production system to the disaster-recovery system over IPC.
Q. What would you do if an RDB fails?
A. Restart the RDB. On start-up it resubscribes to the TP and replays the log file to bring itself back up to date.
Q. What is the difference between batching & zero-latency mode?
A. In batching mode the TP temporarily stores data in its tables, and publishing to the clients occurs on the
next tick of the timer. In zero-latency mode the TP doesn't store any data, and publishing to the
clients occurs every time a message is received from the feed-handler via the upd function.
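The difference can be sketched in a single process. This is a simplified model rather than the tick.q source: everything under .demo is hypothetical, and "publishing" just records the table name and row count instead of sending to subscribers.

```q
/ Simplified sketch, not the tick.q source; the .demo names are hypothetical.
trade:([]time:`timespan$();sym:`symbol$();price:`float$())
.demo.sent:()                                      / record of (table;row count) "published"
.demo.pub:{[t;x] .demo.sent,:enlist(t;count x);}

/ Batching: buffer each update, then publish and clear on the timer tick.
.demo.updBatch:{[t;x] t insert x;}
.demo.onTimer:{[t] .demo.pub[t;value t]; @[`.;t;0#];}

/ Zero-latency: build a table from the message and publish it straight away.
.demo.updZero:{[t;x] .demo.pub[t;flip cols[t]!x];}

data:(0D09:30:00 0D09:30:01;`a`b;1.5 2.5)
.demo.updBatch[`trade;data]    / nothing published yet, two rows buffered
.demo.onTimer`trade            / publishes the two rows and empties the buffer
.demo.updZero[`trade;data]     / publishes two rows immediately, nothing buffered
```

In both cases the subscribers end up with the same rows; only the moment of publication and whether the TP holds data in between differ.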
Q. Why would you choose batching or zero-latency mode?
A. Zero-latency mode is used where the client needs every update as soon as it arrives, such as high-frequency
trading, where every nanosecond can give you an advantage. However, publishing is expensive in terms of
time. For this reason, clients such as GUIs and human traders would choose batching mode, where an update
every second is perfectly adequate and is more efficient for the TP.
Q. How would you choose batching mode?
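A. You can choose to run in batching mode by setting the timer on the command line when starting the TP, using
-t with an interval in milliseconds. The TP script checks whether system"t" is non-zero and defines the
batching versions of .z.ts and upd if it is, or the zero-latency versions if it is not.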
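A minimal sketch of that check. The mode function is a hypothetical helper, and the port and interval in the comment are only examples:

```q
/ Started as, for example:  q tick.q sym . -p 5010 -t 1000
mode:{$[x>0;`batching;`zeroLatency]}   / hypothetical helper
mode system"t"                         / system"t" is the timer interval in ms, 0 when no -t was given
```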
Q. How can you check if a log file exists?
A. Check if the key of .u.L returns an empty list. If yes, it doesn’t exist.
Q. In what format does the TP receive data from the feed-handler?
A. As column-oriented lists, meaning all the values for each individual column are held together.
Q. How does the TP create a q table from the data it receives?
A. A line of logic in the .u.upd function gets the column names of the table from the TP schema and creates a
dictionary with these columns as the key and the corresponding data as the values. The dictionary is then
flipped (or enlisted, for a single row) to give a table.
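The conversion can be sketched as follows. The trade schema and the toTable name are assumptions for the example; the body follows the column-dictionary approach described above.

```q
trade:([]time:`timespan$();sym:`symbol$();price:`float$())   / hypothetical schema

/ Column names come from the schema; a single row is enlisted, several rows are flipped.
toTable:{[t;x] f:key flip value t; $[0>type first x;enlist f!x;flip f!x]}

many:toTable[`trade;(0D09:30:00 0D09:30:01;`a`b;1.5 2.5)]   / two-row table
one:toTable[`trade;(0D09:30:00;`a;1.5)]                      / one-row table
```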
Q. How would you tell if an RTS was hogging memory or acting slow?
A. Check the .z.W dictionary on the TP. The keys are the handles of the connections to the TP and the values
show the number of outstanding bytes waiting in the queue for each.
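A sketch of how that check might be wrapped up. The backlog helper, the threshold and the sample dictionary are all made up for illustration; on a live TP the first argument would be .z.W itself.

```q
/ Handles whose queued output exceeds n bytes (hypothetical helper).
backlog:{[w;n] where n<sum each w}

/ On a live TP:  backlog[.z.W;1000000]
/ Made-up snapshot: handle 7 has nothing queued, handle 8 has two large messages waiting.
behind:backlog[7 8i!(0#0;600000 700000);1000000]
```

Here behind contains only handle 8, the subscriber that is not keeping up.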
Ticker-plant functions:
.u.tick: First function to be called when the TP starts up.
- Executed where: on a standalone line in the TP script
- Purposes:
  1. Execute .u.init
  2. Verify that all tables on the TP have time, sym as their first two cols
  3. Apply grouped attribute to sym columns of all tables
  4. Set .u.d to current date
  5. Execute .u.ld to create TP log file
- Args:
  1. Name of schema file
  2. Directory of log file and HDB
.u.init:
- Executed where: within .u.tick
- Purposes:
  1. Define the list of tables (.u.t) which can be subscribed to
  2. Define the dictionary (.u.w) which maps each table name to the related subscriber handle
- Args: None
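.u.ld:
- Executed where: within .u.tick at start-up and within .u.endofday
- Purposes:
  1. Create the ticker-plant log file for the given date if it does not already exist
  2. Count the messages already in the log file (.u.i)
  3. Open a handle to the log file
- Args:
  1. .u.d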
.u.sub:
- Executed where: as a result of a remote synchronous call from an RTS/RDB
- Purposes:
  1. Registers the calling RTS/RDB as a subscriber on the TP
  2. Subscribes to specified tables & specified syms
- Args:
  1. Table name(s)
  2. Required sym(s)
- Result: The calling RTS is added to .u.w and receives the appropriate empty schemas. In the same request the RDB also asks for the location of the log file (.u.L) and the number of messages in it (.u.i). The RTS/RDB then replays the first .u.i messages to get up to speed.
.u.del:
- Executed where: within .u.sub, and when a subscriber's connection closes
- Purposes:
  1. Clear out any pre-existing subscription from the new RTS to the table they've subscribed to within .u.w
- Args:
  1. Table name(s)
  2. Caller's handle
.u.add:
- Executed where: within .u.sub
- Purposes:
  1. Modifies .u.w with the new subscription
  2. Returns an empty copy of the relevant table to the new subscriber
- Args:
  1. Table name(s)
  2. Required sym(s)
- Result: The calling RTS will be added to .u.w. It will return the appropriate empty schemas to the RTS.
.u.sel:
- Executed where: within .u.pub
- Purposes:
  1. Selects whatever subset of the table the RTS has subscribed to
- Args:
  1. Full contents of table
  2. Required sym(s)
.u.pub:
- Executed where: within .z.ts (if publishing on timer) or within .u.upd (if publishing on every update)
- Purposes:
  1. Publishes the relevant rows of the input tables to all interested RTS
- Args:
  1. Table name
  2. Current contents
.u.end:
- Executed where: within .u.endofday
- Purposes:
  1. Sends async message to each RTS & the RDB to execute their individual .u.end functions
- Args:
  1. Yesterday's date
.u.endofday:
- Executed where: within .u.ts if gone past EOD
- Purposes:
  1. Send identical message to all subscribers telling them to execute their EOD function .u.end
  2. Increment current date .u.d
  3. Close connection to old log file and establish connection to new log file
- Args: None
.u.ts:
- Executed where: within timer function .z.ts
- Purposes:
  1. Checks to see if we have gone past midnight by testing against .u.d
- Args:
  1. The (new) current date
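The filtering step described for .u.sel can be sketched as below. The sample table is made up, and the function is written here as a plain sel rather than in the .u namespace; a backtick on its own stands for "all syms".

```q
/ Return the whole table when no sym filter is given, otherwise only the requested syms.
sel:{[x;y] $[`~y;x;select from x where sym in y]}

t:([]sym:`a`b`a;price:1 2 3f)    / made-up data
allRows:sel[t;`]                 / subscriber asked for every sym
onlyA:sel[t;`a]                  / subscriber asked for sym a only
```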
RDB functions:
.u.end:
- Executed where: at EOD, as a result of the ticker-plant telling it to
- Purposes:
  1. Note which tables have the g attribute applied to their sym column, so it can be restored afterwards
  2. Call the .Q.hdpf function, which saves all tables by calling .Q.dpft, clears the tables, and sends a reload message to the HDB
  3. Apply the g attribute back to the sym column of each of those tables
- Args:
  1. The date being saved, as sent by the TP
.u.rep:
- Executed where: on a standalone line when the RDB starts up
- Purposes:
  1. Initialise the TP tables on the RDB to be empty
  2. Uses -11! to replay TP log file as necessary
  3. CD to the top level of the HDB
- Args:
  1. Initial burst of data from the subscription to the tickerplant
  2. .u.i and .u.L
Log Files & replaying:
Q. What is .u.L?
A. .u.L is the path to the TP log file.
Q. What is .u.l (lowercase L)?
A. .u.l is the handle to the TP log file. It is set from the result of the .u.ld function.
Q. How can you check if a log file has been initialised?
A. Check whether the key of .u.L is an empty list. If it is, the file doesn't exist.
Q. How do you initialise a log file?
A. Run .u.L set (), which writes an empty list to the log file path.
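Both steps, checking for the file and initialising it, can be tried against a scratch file. The path and the logExists helper are hypothetical; in the TP the path would be .u.L.

```q
logf:`:demo_tp_init.log          / hypothetical path standing in for .u.L
logExists:{not ()~key x}         / key returns an empty list when the file is missing

before:logExists logf            / false if the file has not been created yet
logf set ();                     / initialise: write an empty list to the path
after:logExists logf             / true once the file exists
hdel logf;                       / remove the scratch file
```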
Q. How does the TP send messages to the log file?
A. The last step in the upd function checks whether .u.l (lowercase L), the handle to the log file, is set. If it
is, the function sends an enlisted message to that handle. The message contains 3 items: the function name,
the table name, and the data to be inserted.
Q. What is contained in a log file?
A. In general, a log file will contain lists, each list with the first item as the function and the rest of the items its
arguments.
Q. In what format are the entries in a log file in a TP architecture?
A. Within the log file the messages are in binary format, but if we were to get any of the messages in memory
they would be 3 item lists – the upd function, the table name and finally the data to be written.
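A short sketch of writing two such messages and reading them back. The file path and the sample rows are made up; reading the log with get is used here only to inspect it.

```q
logf:`:demo_tp_msgs.log          / hypothetical path
logf set ();                     / start from an empty log
h:hopen logf;
h enlist(`upd;`trade;(0D09:30:00;`a;1.5));   / each entry: function, table name, data
h enlist(`upd;`trade;(0D09:30:01;`b;2.5));
hclose h;

msgs:get logf                    / the log read back into memory as a list of messages
counts:count each msgs           / three items in every message
fns:msgs[;0]                     / the function named in each message
hdel logf;                       / remove the scratch file
```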
Q. How does an RDB replay a log file?
A. It uses -11!
Q. What does -11! do when called upon?
A. It reads each message in the log file in turn and runs the named function with the table name and table data as its arguments.
Q. How does the RDB use -11!? What are the steps?
A. Through the .u.rep function defined in the RDB script. It takes two arguments: first the empty tables from the
TP, and second a two-item list of (.u.i;.u.L). It checks whether .u.i is null. If it is, it returns early; if not, it
runs -11! with the two-item list (.u.i;.u.L) as its argument.
Q. How does .u.rep get these two arguments?
A. A line in the RDB script opens a handle to the TP and sends one synchronous request that evaluates a
two-item list on the TP. The first item subscribes to all tables and syms on the TP, and the TP returns the
empty schemas. This is the first argument. The second item fetches .u.i and .u.L as a two-item list. This is
the second argument. Dot apply (.) then passes the two items to .u.rep as its two arguments.
Q. What are the three overloads of -11!?
A. 1. -11! (logFilePath): this will replay the whole log file into memory.
2. -11! (N ; logFilePath): this will replay the first N records and is the overload used in kdb+ tick, as
-11! (.u.i ; .u.L).
3. -11! (-2 ; logFilePath): this will get the count of all the messages in the log file.
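The three overloads can be tried against a small scratch log. The path, schema and rows are made up for the example, and upd is defined as insert so that replayed messages land in the local table.

```q
logf:`:demo_tp_replay.log        / hypothetical path
logf set ();
h:hopen logf;
h enlist(`upd;`trade;(0D09:30:00;`a;1.5));
h enlist(`upd;`trade;(0D09:30:01;`b;2.5));
hclose h;

trade:([]time:`timespan$();sym:`symbol$();price:`float$())   / hypothetical schema
upd:insert                       / replayed messages call upd with table name and data

n:-11!(-2;logf)                  / overload 3: number of messages in the log
-11!(1;logf);                    / overload 2: replay only the first message
afterOne:count trade
-11!logf;                        / overload 1: replay the whole log, on top of the row above
afterAll:count trade
hdel logf;                       / remove the scratch file
```

Because the full replay runs after the partial one, the table ends with three rows: the first message once from each call, and the second message once.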