




Study with the several resources on Docsity
Earn points by helping other students or get them with a premium plan
Prepare for your exams
Study with the several resources on Docsity
Earn points to download
Earn points by helping other students or get them with a premium plan
kdb q review questions and answers
Typology: Quizzes
1 / 8
This page cannot be seen from the preview
Don't miss anything!





Datatypes: atoms, lists, dictionaries, tables (keyed and unkeyed) ● Atoms : all data is ultimately built from atoms. An atom is an irreducible value with a specific data type. (Boolean, short, int, long, symbol, time) ● Lists : data complexity is built up from atoms. A list is an ordered collection of atoms and other lists. The order of the items in the list is positional ● Dictionaries : generalizations of lists and provide the foundation for tables. A dictionary is an ordered collection of key-value pairs that is hashable. (key!value); type: 99h ● Tables : collection of named columns implemented as a dictionary. Only effect of flipping the column dictionary is to reverse the order of its indices; no data is rearranged under the covers: type 98h ([] c1:L1)
Amending Lists ● Assignment L[1]: ● Count L ● Joining with , ● Merging with ^ etc …
Simple lists and list at depth ● Nested lists have item(s) that are themselves lists; the number of levels of nesting for a list is called its depth. Atoms are depth 0 and simple list have depth 1.
Use of @ and. amend ●. o q) d:((1 2 3;4 5 6 7);(8 9;10;11 12);(13 14;15 16 17 18;19 20)) o q) d. 1 2 / same as d[1;2] 11 12 o q) d. 1 2 0 / same as d[1;2;0] 11 o q) d. enlist 1 / same as d[1] (8 9; 10;11 12) ● @ o Used for indexing at the top level o q) d:((1 2 3;4 5 6 7);(8 9;10;11 12);(13 14;15 16 17 18;19 20)) o q) d@ (8 9;10;11 12) o q)d@1 2 / select 2 items at the top level ((8 9; 10;11 12); (12 14;15 16 17 18; 19 20))
Joins ● the essence of relational database design is normalizing data using relations and keys and then reassembling with joins. Normalization eliminates duplicate data, which takes up space and is hard to keep consistent ● aj (Asof join) o joins tables along common column using most recent values o aj[c1…cn;t1;t2]
o q)show t:([] ti:10:01:01 10:01:03 10:01:04;sym:msftibm`ge;qty: 200 150) ti sym qty
10:01:01 msft 100 10:01:03 ibm 200 10:01:04 ge 150 o show q:([] ti:10:01:00 10:01:01 10:01: 10:01:02;sym:ibmmsftmsftibm;px:100 99 101 98) ti sym px
10:01:00 ibm 100 10:01:01 msft 99 10:01:01 msft 101 10:01:02 ibm 98 o q)aj[symti;t;q] ti sym qty px
10:01:01 msft 100 101 10:01:03 ibm 200 98 10:01:04 ge 150 ● lj (ad hoc Left Join) o q)t:([] k:1 2 3 4; c:10 20 30 40) o q)kt:([k:2 3 4 5]; v:200 300 400 500) o q)t lj kt // if in t col name was v, priority on kt k c v
1 10 2 20 200 3 30 300 4 40 400 ● ij (inner join) o q)t:([] k:1 2 3 4; v:10 20 30 40) o q)kt:([k:2 3 4 5]; v:200 300 400 500) o q)t ij kt k v
2 200 3 300 4 400
Functions and projections ● a function is a sequence of expressions to be evaluated, having optional parameters and a return value o q) f:{[x] x*x} o the notation for function definition is: ▪ {[p1;…;pn] e1; …;en} ● Function Projections: sometimes a function of valence two or more is evaluated repeatedly while some of its arguments are held constant. For this situation, a multivalent function can have one or more arguments fixed and the result is a function of lower valence called the
Qsql statement and table manipulation ● Q has a collection of functions for manipulating tables that are similar to their counterparts in SQL- qsql.(insert, select ,update..) ● Difference: o Q table has well-defined record and column orders o Q table is stored physically as a collection of column vectors- this means that operations on column data are easy and fast since atomic, aggregate or uniform functions applied to columns are optimized vector operations o Upsert semantics (one dataset can be applied to another without the need to separate inserts from updates) ● SELECT o Select from texp o Select from tk o Result is a table o Order of execution: from, where, by, select o Select phrase controls which columns appear in the result ● EXEC o Syntax is identical to that of select o Return variable is either list(1 column return) or dictionary(2 or more columns returned) ● UPDATE o Syntax identical to select o If column exist –update, if not create one o The original table is not affected// use backtick to update the original table ● DELETE o Allows either rows or columns to be deleted o Syntax simplified from select ▪ Delete from texp ▪ Delete in place ● Delete from t where c2= **● INSERT, UPSERT o INSERT** o q)t:([] name:symbol$(); iq:int$()) o q)t,:nameiq!(Beelbrox; 42) o -------------------------------------------------------------- o t:([] name:DentBeeblebroxPrefect; iq:42 98 126) ot insert (Marvin; 150) o Repeatedly applying insert to a table with the same argument results in **duplicate records** o Insert into keyed tables ▪t insert (1004; Slartibartfast; 158) **o UPSERT ▪** Like INSERT, only better. ot upsert (Marvin; 150) o upsert[t; (Slartibartfast; 134)] okt upsert (1001; `Beeblebrox; 43)
▪ We have seen that insert has undesirable semantics on keyed tables –i.e., it rejects “duplicate “keys. What we really want is, upsert semantics. ● FBY o in where o a common use case in a query is to filter on groups. ▪ q)select max weight by city from p o But you run into the issue of what to do with the other fields in the sub-query? Do without aggregation , which loses the information of where the maximum occurs? o Fby take the form: ▪ (fagg;exprcol) fby c ▪ Aggregate function and column expression on which the function will be applied, the right operand c is a symbol containing the name of the column whose values are grouped to form lists for the aggregate function. ▪ tdetails:([] eid:`tk$1003 1002 1001 1002 1001 1002; sc:126 36 92 39 98 42) ▪ Check this for details: link ▪ Aggregation function does not return Boolean so we cannot use them in where statement ● select distinct eid from tdetails where 1 Adverbs ● Kdb provides adverbs which can alter function and verb behaviour to improve efficiency and keep code concise. Employing adverbs correctly can bypass the need for multiple loops and conditionals with significant performance enhancements. ● Monadic : each-both, each-prior, scan, over ● Dyadic : each-both; each-prior; each-right; each-left; scan; over ● EACH : o f each o count (10 20 30; 40 50) 2 o count each (10 20 30; 40 50) 3 2 ● EACH-BOTH o (1,2,3),’(10;20;30) ● EACH LEFT o Modifies each left argument with the second argument o ("Now";"is";"the";"time"),: ", " "Now, " "is, " "the, " "time, " ● EACH RIGHT o Applies the entire first argument to each item of the second argument o " ," ,/: ("Now";"is";"the";"time") " ,Now" " ,is" " ,the" " ,time" ● OVER (/) o Modifies a base dyadic function so that the items of the second argument are applied iteratively to the first argument o */[7 6 5 4 3 2 1] 5040 ● SCAN () o The same as over, but it will return the results of each iteration o *[7 6 5 4 3 2 1] 7 42 210 840 2520 5040 5040
● EACH-PREVIOUS (‘:) o Modifies a base dyadic function so that each item of the right operand is applied to its predecessor. o 0+’:1 2 3 4 5 1 3 5 7 9
Historical Data ● Splayed tables o For larger tables that may not fit into memory on all machines, you can ask q to serialize each column of the table to its own file in a specified directory. A table persisted in this form is called a splayed table
o Advantage is that when querying a splayed table, only the columns referred to in the query will be loaded into memory o This saves the table splayed ▪ q):dirname/ set table ▪ order of the columns is stored in the hidden .d file o limitations: ▪ all column must be simple or compound list ▪ symbol columns must be enumerated ▪ keyed tables no o you can read elements with get ▪ get:/db/t/v o you can query on them ▪ select from t o Appending to a Splayed table: ▪ Upsert/ insert does not work ▪ And make rows into tables o Removing a column from splayed table: ▪ Access .d file and remove/rearagne the .d file ▪ :/db/t/.d set get[:/db/t/.d] except ‘s o Example: ▪ :/db/t/ set ([] ti:09:30:00 09:31:00; p:101.5 33.5) **:/db/t/** ▪ get :/db/t/.d **tip** ▪ get:/db/t/ti 09:30:00 09:31: ▪ get :/db/t/p **101.5 33.** o To append a record (or records) to the persisted image of a splayed table, use its file handle with upsert or the equivalent functional form. o Column update: ▪ File hande using functional @ e.g. @[get:/db/t/p;1;:;42.] o Reorder columns ▪ :/db/t/.d setp`ti