

























































Huiran Li and Yanwu Yang
School of Management, Huazhong University of Science and Technology
Abstract: In sponsored search advertising, advertisers need to make a series of keyword decisions.
Among them, how to group these keywords to form several adgroups within a campaign is a
challenging task, due to the highly uncertain environment of search advertising. This paper
proposes a stochastic programming model for keywords grouping, taking click-through rate and
conversion rate as random variables, with consideration of budget constraints and advertisers’ risk-
tolerance. A branch-and-bound algorithm is developed to solve our model. Furthermore, we
conduct computational experiments to evaluate the effectiveness of our model and solution, with
two real-world datasets collected from reports and logs of search advertising campaigns.
Experimental results show that our keywords grouping approach outperforms five baselines,
and that it approximately approaches the optimum in a steady way. This research generates several
interesting findings that offer critical managerial insights for advertisers in sponsored search
advertising. First, keywords grouping does matter for advertisers, especially when the number of
keywords is large. Second, in keywords grouping decisions, the marginal profit does not
necessarily diminish as the budget increases. Hence, it is worthwhile for advertisers to
increase their budget in keywords grouping decisions in order to obtain additional profit.
Third, the optimal keywords grouping solution results from a multifaceted
trade-off among various advertising factors. In particular, assigning more keywords to adgroups
or having more budget will not necessarily lead to higher profits. This warns
advertisers that it is unwise to take the number of keywords as the criterion for keywords grouping
decisions.
Keywords: keywords grouping, keyword decisions, sponsored search advertising, chance
constrained programming
Huiran Li & Yanwu Yang (2020). Optimal Keywords Grouping in Sponsored Search Advertising
under Uncertain Environments, International Journal of Electronic Commerce, 24(1), 107-129.
DOI: https://doi.org/10.1080/10864415.2019.1683704.
Sponsored search advertising has evolved into one of the most prominent online advertising
channels [66]. Millions of advertisers choose search advertising to promote their products and
services, taking advantage of precise targeting [9], low advertising costs [53] and high return on
investment [28, 31]. Internet advertising revenues hit a record high of $107.5 billion in 2018, of which
search advertising accounted for 45.1% [25]. In sponsored search advertising, advertisers
need to make a series of keyword-related decisions. Indeed, keywords serve as a bridge linking
advertisers, search users and search engines [65]. Unlike other forms of online advertising,
search advertisers have to organize keywords according to advertising structures defined by search
engines. Well-organized keywords can secure more traffic and revenue by serving the right
ads to the right customers [63]. Therefore, for search advertisers, how to effectively organize
keywords of interest in their campaigns is a critical issue.
Throughout the entire lifecycle of search advertising campaigns, advertisers face a
series of keyword-related decisions, namely keyword generation, selection, grouping and
adjustment [65]. Current research efforts on keywords mainly focus on
keyword generation (e.g., [42, 43, 68]) and keyword selection (e.g., [29, 33]). From the operational
perspective, keyword decisions on how to organize keywords according to
advertising structures defined by search engines have yet to be explored.
Sponsored search advertising campaign development involves organizing keywords into
adgroups and developing adcopies for adgroups [8]. In the search advertising structure employed
by major search engines (e.g., Google, Bing), one or several campaigns run simultaneously
under an advertiser’s account to fulfill a certain promotional goal, where each campaign includes
one or several adgroups, and each adgroup in turn contains one or more adcopies and a shared set
of keywords. Naturally, the adgroup is prevalent as the basic unit for daily advertising operations. First,
Bing Ads (2019) [5] states that adgroups are the best way to organize advertising campaigns. In
particular, adgroups allow advertisers to better track the effectiveness of their advertising efforts
[19]. Second, an advertiser needs to build a keyword list for each adgroup from a predefined set
of keywords of interest in order to precisely display her ads to the targeted consumers [20]. When
a searcher’s query matches one or more keywords in an adgroup, its associated advertisement will
be triggered to appear on the search engine results pages. Thus, organizing keywords with
There are many challenges associated with keywords grouping decisions. On the one hand, the
search advertising environment is highly uncertain [45, 60]. That is, advertisers have to make
keywords grouping decisions before obtaining values of keyword performance indexes. As [36]
stated, IT-based high-technology industries share common characteristics, featuring
market uncertainty, technology uncertainty, and competitive volatility [59]. Likewise, search
advertising suffers from three types of uncertainty: disturbance coming from market noise
(e.g., trending social news sharply increases the search volume and clicks of some keywords),
uncertainty stemming from technology evolution (e.g., search engines improve their ranking
algorithms for advertising display), and uncertainty originating from competitive volatility (e.g.,
advertisers can adjust their strategies arbitrarily). On the other hand, advertisers, especially those
from small enterprises, usually face serious budget constraints [61, 62]. This implies that advertisers
need to appropriately group keywords under a limited budget in order to maximize their
advertising performance. The uncertainties of search advertising are reflected in several
performance indexes. More specifically, click-through rate (CTR) and conversion rate (CVR) vary
drastically over keywords and are often unknown in advance, which raises large uncertainties for
keywords grouping decisions [16]. This motivates us to study the keywords grouping problem in
a stochastic model. Our premise is that advertisers can obtain a limited amount of information about
the range of values taken by factors of interest (e.g., probability distributions) by analyzing
historical reports from search advertising campaigns. The intended work differs from the
above keywords grouping approaches in two respects. First, our approach considers uncertainties in
search advertising markets and conducts risk management for advertisers with different risk
preferences. Second, our approach, based on the branch-and-bound algorithm, can obtain the optimal
solution by traversing the solution space of keywords grouping decisions.
In this work, we formulate a stochastic model for keywords grouping to maximize the
expected profit from a search advertising campaign. In particular, our model takes CTR and CVR
as random variables¹. First, we use the concept of chance constraint to describe the probability of
meeting the budget constraint within a certain degree. Second, the variance of profit over a unit of
budget at the campaign level is used to measure an advertiser’s risk, in order to balance expected
profit and risk exposure. Then we develop a branch-and-bound solution process to solve our
keywords grouping model. Furthermore, we conduct computational experiments to evaluate the
performance of our keywords grouping model with two real-world datasets collected from field
reports and logs of search advertising campaigns, by comparing it with five baselines. Among them,
the first two are commonly used in practice, the third and the fourth are derived from the extant
literature, and the fifth is a deterministic approach derived from our approach. The first baseline
represents the case that the advertiser puts all keywords into a single adgroup (i.e., BASE1-Nogrouping).
The second baseline subdivides keywords according to the advertiser's products to
be promoted to form adgroups (i.e., BASE2-Product). The third applies k-means clustering to
segment keywords (i.e., BASE3-Kcluster). The fourth categorizes keywords according to a
keyword hierarchy based on semantic relationships (i.e., BASE4-Hierarchy). The fifth baseline
assigns keywords into adgroups according to their profits in a greedy manner, by following a
deterministic model derived from our stochastic keywords grouping model developed in Section
4 (i.e., BASE5-Profit).
¹ Note that CTR and CVR might be improved, more or less, through keywords grouping strategies; however, they are
intermediary performance indexes rather than the ultimate goal for advertisers. Moreover, neither CTR nor CVR provides
a comprehensive clue to the ultimate advertising goal [26]. In this work, we distinguish the CTR (CVR) inherent in
keywords themselves from the CTR (CVR) raised by adgroups, and use their product to represent the CTR
(CVR) of a keyword assigned to an adgroup.
Experimental results show that a) our keywords grouping approach outperforms the five
baselines in terms of profit and ROI, with relatively lower risks; b) compared to the five baselines,
our approach assigns more keywords and approximately approaches the optimum in a steady
way; c) in keywords grouping decisions, the profit grows as the budget increases;
however, the marginal profit does not necessarily diminish, i.e., it
does not always decrease as the budget increases; d) assigning more keywords to
adgroups will not necessarily lead to a higher profit. Essentially, the optimal keywords grouping
solution results from a multifaceted trade-off among various advertising factors.
These findings provide critical managerial insights for advertisers in sponsored search. First,
keywords grouping is a critical advertising decision that cannot be overlooked, especially in a
more complicated market environment with a large number of keywords. Second, increasing
the budget in keywords grouping decisions can be worthwhile for advertisers seeking additional
profit. Third, this research warns advertisers that it is unwise to take the number
of keywords as the criterion for keywords grouping decisions.
The key contributions of our study include the following. From the academic perspective, to
our knowledge, this is the first study on keyword grouping decisions. In the extant literature, few
majority of research on search advertising decisions has focused on bidding strategy [ 4 , 10 , 14 , 50 ,
52 , 67 ], budget allocation [ 62 ], and keyword decisions [1, 38 , 44 , 47 , 65 ].
From the perspective of search engines, theoretical and empirical analyses in [15] suggested
that strategic behaviors are widespread and costly, and that a switch to a VCG-based mechanism might
stabilize auction outcomes with neutral or even positive effects on revenues, at least relative to the
old Overture mechanism that was based on the first-price auction. By considering bid dynamics
and rankings of advertisers, [69] proposed a dynamic model and identified an equilibrium bidding
strategy. Their empirical framework, based on a Markov switching regression model, suggested
the existence of cyclical bidding strategies. Using a game-theoretic model, [4] examined the
strategic role of keyword management costs arising from advertisers’ decisions (e.g., the
keywords they choose to bid on and their bidding prices) and of broad match, which automates
bidding on keywords, in sponsored search advertising. Their analysis showed that the search
engine will increase broad match bid accuracy up to the point where advertisers choose broad
match, but increasing the accuracy any further reduces the search engine’s profits. By
building a hierarchical Bayesian model to address the endogeneity problem and using the Markov
Chain Monte Carlo (MCMC) method to identify the parameters, [14] empirically explored how to
manage ad campaigns when advertisers have to bid on multiple keywords. The results suggested
that it is important to differentiate among bidding strategies for different keyword
categories and match types. [69] modeled budget-constrained bidding as a stochastic multiple-choice
knapsack problem, and designed an algorithm that selects items online based on a threshold
function that can be built from historical data. Their algorithm achieved about 99% of the
performance of the offline optimum when applied to a real bidding dataset.
Another stream of search advertising decision research is budget allocation. Effectively allocating the
limited advertising budget is a critical search advertising decision. With multiple search
advertising markets and a finite time horizon, [62] developed a novel budget allocation
optimization model. A customized advertising response function was proposed to capture
distinctive features of sponsored search, including the quality score and the dynamic advertising
effort. [37] explored how advertisers can distribute their advertising budget over the keywords of interest in
order to maximize their return. The results showed that simple prefix strategies that invest in all
cheap keywords up to some level are either optimal or good approximations in many cases.
However, advertisers are in fact not allowed to spread their budget across keywords
directly in actual sponsored search advertising.
In the next section, we narrow down to keyword related decisions in sponsored search
advertising.
2.2. Keyword Decisions
Keywords serve as an essential bridge linking advertisers, search users and search engines in
sponsored search advertising. Advertisers have to deal with a series of keyword decisions
throughout the entire lifecycle of search advertising campaigns, including keyword generation,
selection, grouping and adjustment [65]. The authors of [65] developed an integrated multi-level computational
framework for keyword optimization (MKOF) supporting a set of strategies across different levels
of abstraction (e.g., domain, market, campaign, adgroup and keyword) throughout the lifecycle
of sponsored search advertising campaigns. Moreover, advertisers have to monitor the real-time
performance of advertising campaigns and adjust their keyword decisions accordingly. Existing
research on keyword-related decisions primarily focuses on the first two issues.
Keyword generation methods can be categorized into three streams, i.e., query log-based, proximity-based
and meta-tag crawler-based methods [1, 42]. In the branch of query log-based methods,
keywords are mainly suggested by conducting association/co-occurrence analysis in search engine
query logs [34, 68, 70]. Proximity-based keyword generation methods query search engines with
the seed keyword and recommend keywords from the query results possessing high proximity to
the seed keyword [1, 58]. In addition, some efforts calculate the proximity based on vocabulary
dictionaries/corpora pre-constructed by domain experts [11], e.g., thesaurus dictionaries, Wikipedia,
etc. The meta-tag crawler-based methods focus on finding relevant keywords from meta-tags.
They send the seed keyword to the search engine and extract meta-tag keywords from the top
ranked web pages [42]. Some popular online tools like WordStream and Wordtracker use meta-tag
crawlers to search meta-tag keywords and suggest relevant keywords to
advertisers.
Selecting the most appropriate keywords after keyword generation helps prevent advertisers
from targeting the wrong groups of consumers and eventually wasting their advertising budget on
poor returns [27]. [45] selected keywords by ranking them on their profit-to-cost ratio, which
guarantees the convergence of the average expected profit to a near-optimal solution. [29] proposed
to optimize advertising keywords with feature selection techniques applied to the set of all possible
In sponsored search advertising, keywords grouping decisions are influenced by many factors
(e.g., CTR, CVR) that cannot be known precisely in advance. This motivates us to explore the
keywords grouping problem in a stochastic setting with consideration of budget constraints of
adgroups and advertisers’ risk-tolerances. In this work, we intend to explore the problem of
keywords grouping under an uncertain environment. To the best of our knowledge, this is the first
research effort in this direction.
In this section, we build a stochastic model for keywords grouping to maximize the expected profit
in sponsored search advertising, with consideration of budget constraints of adgroups and
advertisers’ risk-tolerances. Although there might be other constraints for advertisers (e.g., geography and
time), our research considers the budget and risk constraints that are commonly taken into
account in prior work (e.g., [51, 60]). The keyword decision scenario under consideration in this
research is: for an advertiser, given a set of campaign-specific keywords, how to group these
keywords into several adgroups. The notations used in this paper are listed in Table 1.
3.1 The Objective
Let $d_i$ denote the total number of search demands of the $i$-th keyword in a search market. The
search demand of a keyword is defined as the total number of queries triggered from it. Let $c_{ij}$
denote the click-through rate (CTR) of the $i$-th keyword in the $j$-th adgroup. Given an advertising
campaign with $m$ adgroups and a set of $n$ keywords, the decision variable $x_{ij}$,
$i = 1, \dots, n$, $j = 1, \dots, m$, indicates whether the $i$-th keyword is assigned to the $j$-th adgroup or not, i.e.,
$$x_{ij} = \begin{cases} 1, & \text{if the } i\text{-th keyword is assigned to the } j\text{-th adgroup,} \\ 0, & \text{otherwise.} \end{cases}$$
Let $p_{ij}$ denote the cost-per-click (CPC) of the $i$-th keyword in the $j$-th adgroup. According to
major search advertising structures, advertisers can set the max CPC on both the adgroup and
keyword levels. In our research, for the keywords grouping problem, we use $p_{ij}$ on the keyword
level. Then the cost for a campaign is $\sum_{j=1}^{m} \sum_{i=1}^{n} x_{ij} d_i c_{ij} p_{ij}$.
Let $r_{ij}$ and $v_i$ denote the conversion rate (CVR) and value-per-sale, respectively. Thus, the
profit of an advertising campaign can be represented as
$$z(x_{ij}) = \sum_{j=1}^{m} \sum_{i=1}^{n} x_{ij} d_i c_{ij} (r_{ij} v_i - p_{ij}).$$
In this research, we use CTR $c_{ij}$ and CVR $r_{ij}$ as random vectors to capture uncertainties in
searchers’ behaviors, advertising market volatility, etc. So $z(x_{ij})$ is also a random variable.
Therefore, the objective of keywords grouping decisions is to maximize the profit expected in an ad
campaign, given by
$$E[z(x_{ij})] = E\Big[\sum_{j=1}^{m} \sum_{i=1}^{n} x_{ij} d_i c_{ij} (r_{ij} v_i - p_{ij})\Big].$$
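As a rough illustration of the objective, the sketch below evaluates the expected profit of a given 0-1 assignment. It assumes, beyond what the text states, that CTR and CVR are independent, so that $E[c_{ij} r_{ij}] = E[c_{ij}] E[r_{ij}]$; the function name `expected_profit` and all toy numbers are hypothetical.

```python
def expected_profit(x, d, mean_ctr, mean_cvr, v, p):
    """Expected profit E[z(x)] of a 0-1 assignment x (n keywords by m adgroups).

    Assumption (not from the paper): CTR and CVR are independent,
    so E[c * r] = E[c] * E[r].
    """
    total = 0.0
    for i, row in enumerate(x):
        for j, assigned in enumerate(row):
            if assigned:
                # Expected profit per click: E[r_ij] * v_i - p_ij
                margin = mean_cvr[i][j] * v[i] - p[i][j]
                total += d[i] * mean_ctr[i][j] * margin
    return total


# Hypothetical toy data: two keywords, two adgroups.
d = [1000, 500]                          # search demands d_i
mean_ctr = [[0.05, 0.04], [0.10, 0.08]]  # E[c_ij]
mean_cvr = [[0.02, 0.03], [0.05, 0.04]]  # E[r_ij]
v = [100.0, 60.0]                        # value-per-sale v_i
p = [[1.0, 1.2], [2.0, 1.5]]             # CPC p_ij
x = [[1, 0], [0, 1]]                     # keyword 0 -> adgroup 0, keyword 1 -> adgroup 1
print(expected_profit(x, d, mean_ctr, mean_cvr, v, p))
```

If CTR and CVR were correlated, the product of means would have to be replaced by an estimate of $E[c_{ij} r_{ij}]$ taken from campaign logs.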
3.2 The Budget Constraint
Advertisers usually have a limited budget for search advertising. We can naturally assume that the
budget is binding, i.e., less than a sufficient amount. Let $B_j > 0$ denote the advertising budget
available to a given adgroup $j$; then we have
$$\sum_{i=1}^{n} x_{ij} d_i c_{ij} p_{ij} \le B_j.$$
Due to the stochastic nature of $c_{ij}$, the budget constraint can be represented as a chance
constraint, i.e.,
$$P\Big(\sum_{i=1}^{n} x_{ij} d_i c_{ij} p_{ij} \le B_j\Big) \ge \alpha_j,$$
which requires the probability that the cost of adgroup $j$ stays within the allocated budget to be
greater than or equal to a certain level $\alpha_j$ (i.e., an acceptable probability range). In order
to simplify the expression, we also treat the cost of a keyword,
$s_i = \sum_{j=1}^{m} d_i c_{ij} p_{ij} x_{ij}$, as a random variable. Then we have
$$P\Big(\sum_{i=1}^{n} x_{ij} s_i \le B_j\Big) \ge \alpha_j.$$
3.3 The Risk Constraint
Our keywords grouping model also considers different risk preferences of advertisers. A risk-averse
advertiser prefers certainty to risk, and low risk to high risk, and thus prefers a strategy within
her risk tolerance; a risk-loving advertiser would prefer the chance of getting more revenue
at the cost of high risk; a risk-neutral advertiser would not have any preference. [23] stated that
the profit variance can be interpreted as a risk measure in the advertising market. Following [60], in
order to balance the expected profit and risk exposure, we take the variance of profit $z(x_{ij})$ over
a unit of budget as the risk, given as
$$\frac{\operatorname{Var}[z(x_{ij})]}{\sum_{j=1}^{m} B_j} = \frac{\operatorname{Var}\Big[\sum_{j=1}^{m} \sum_{i=1}^{n} x_{ij} d_i c_{ij} (r_{ij} v_i - p_{ij})\Big]}{\sum_{j=1}^{m} B_j} \le \theta,$$
where $\theta$ is the risk-tolerance of an advertiser.
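The risk measure can be estimated by simulation when no closed form for the profit variance is at hand. The following is only a sketch under assumptions that are not in the text: CTR and CVR are drawn independently from normal distributions given as (mean, standard deviation) pairs and clipped to [0, 1], and the helper name `risk_per_budget_unit` is hypothetical.

```python
import random
import statistics


def risk_per_budget_unit(x, d, ctr, cvr, v, p, budgets, runs=5000, seed=0):
    """Monte Carlo estimate of Var[z(x)] / sum_j B_j.

    ctr[i][j] and cvr[i][j] are (mean, std) pairs. Assumption: independent
    normal draws, clipped to the interval [0, 1].
    """
    rng = random.Random(seed)

    def clip(t):
        return min(1.0, max(0.0, t))

    profits = []
    for _ in range(runs):
        z = 0.0
        for i, row in enumerate(x):
            for j, assigned in enumerate(row):
                if assigned:
                    c = clip(rng.gauss(*ctr[i][j]))  # sampled CTR
                    r = clip(rng.gauss(*cvr[i][j]))  # sampled CVR
                    z += d[i] * c * (r * v[i] - p[i][j])
        profits.append(z)
    return statistics.pvariance(profits) / sum(budgets)


# Hypothetical usage: accept an assignment only if the estimate is within theta.
x = [[1, 0], [0, 1]]
ctr = [[(0.05, 0.01), (0.04, 0.01)], [(0.10, 0.02), (0.08, 0.02)]]
cvr = [[(0.02, 0.005), (0.03, 0.005)], [(0.05, 0.01), (0.04, 0.01)]]
risk = risk_per_budget_unit(x, [1000, 500], ctr, cvr, [100.0, 60.0],
                            [[1.0, 1.2], [2.0, 1.5]], budgets=[300.0, 150.0])
theta = 0.3
print(risk, risk <= theta)
```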
3.4 The Stochastic Keywords Grouping Model
In summary, the keywords grouping problem can be formulated as the following stochastic model:
$$\max \; E\Big[\sum_{j=1}^{m} \sum_{i=1}^{n} x_{ij} d_i c_{ij} (r_{ij} v_i - p_{ij})\Big]$$
process for our keywords grouping model. For more details on the branch-and-bound algorithm, refer
to [30].
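First, we use a stochastic simulation to check whether the chance constraint of budget is
satisfied for each adgroup, which is given in Algorithm SSCCAB (standing for stochastic
simulation for chance constraints of advertising budget). When assigning a keyword into adgroup
$j$, the indicator variable $\hat{x}_{ij} = 1$ if and only if (iff) the total cost stays within the budget of
adgroup $j$ at the required confidence level, i.e., $P\big(\sum_{i=1}^{n} x_{ij} s_i \le B_j\big) \ge \alpha_j$;
otherwise $\hat{x}_{ij} = 0$.
Algorithm (SSCCAB)
Input:
  $B_j$: the budget of the $j$-th adgroup
  $\alpha_j$: the confidence level of the $j$-th adgroup
  $s_i$, with mean $\mu_i$ and standard deviation $\sigma_i$: the $i$-th keyword cost
Output: $\hat{x}_{ij}$, indicating whether the chance constraints are satisfied when assigning the $i$-th
keyword into the $j$-th adgroup
Procedure:
  Generate $s_1, \dots, s_n$ from the corresponding distribution of $s_i$ (mean $\mu_i$, standard
  deviation $\sigma_i$) as a sample.
  If $\sum_{i=1}^{n} x_{ij} s_i \le B_j$, then we have $t_j = t_j + 1$, where $t_j$ counts the samples that
  satisfy the budget.
  If $\alpha_j$ is no greater than the share of samples counted by $t_j$, then $\hat{x}_{ij} = 1$; else $\hat{x}_{ij} = 0$.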
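The simulation check can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes independent, normally distributed keyword costs with known means and standard deviations, and the names `ssccab_check`, `runs` and `seed` are hypothetical.

```python
import random


def ssccab_check(assigned, mu, sigma, budget, alpha, runs=10000, seed=0):
    """Return 1 if the sampled frequency of {adgroup cost <= budget} reaches alpha, else 0.

    assigned: indices of the keywords placed in the adgroup, including the
    candidate keyword. Assumption: keyword costs are independent normals.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(runs):
        # One sampled total cost for the adgroup.
        cost = sum(rng.gauss(mu[i], sigma[i]) for i in assigned)
        if cost <= budget:
            hits += 1
    return 1 if hits / runs >= alpha else 0


# Hypothetical toy data: two keywords with mean costs 100 and 200.
mu, sigma = [100.0, 200.0], [10.0, 20.0]
print(ssccab_check([0, 1], mu, sigma, budget=400.0, alpha=0.95))
print(ssccab_check([0, 1], mu, sigma, budget=300.0, alpha=0.95))
```

With a budget equal to the mean total cost, the constraint holds only about half of the time, so a 0.95 confidence level rejects the assignment.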
Next, we calculate the upper bound for the branch and bound algorithm through continuous
relaxation of model (1). Specifically, we relax $x_{ij}$ from a binary variable in {0,1} to a continuous
variable in [0,1]. Following [41], it is known that the set defined by the constraint
$P\big(\sum_{i=1}^{n} x_{ij} s_i \le B_j\big) \ge \alpha_j$ is convex if the function $\sum_{i=1}^{n} x_{ij} s_i$ is
quasi-convex and $s_i$ has a log-concave density. The first property is easily proved: our function
$\sum_{i=1}^{n} x_{ij} s_i$ is linear, and thus quasi-convex. With regard to the second property, according
to [12], the number of clicks per impression $c_{ij}$ (i.e., CTR) has dimensions [click/impr]. It is a
Bernoulli random variable with parameter $p(x_{ij})$, representing the probability of an advertisement
associated with the $i$-th keyword being clicked if it is assigned to the $j$-th adgroup. Then the
number of clicks of the $i$-th keyword, $C_i$, is a binomial random variable with parameters
$(d_i, p(x_{ij}))$. The binomial can be accurately approximated by the normal provided that
$d_i \, p(x_{ij}) \ge 10$ and $d_i \, (1 - p(x_{ij})) \ge 10$. We therefore naturally assume that the random
variable $C_i$ (i.e., the number of clicks) is normal². Thus, the cost $s_i$ of keyword $i$,
i.e., the product of the number of clicks (the random variable) and the average cost per click
(constant), is also independently normally distributed. The second property holds for
normal distributions. This means that the chance constraint
$P\big(\sum_{i=1}^{n} x_{ij} s_i \le B_j\big) \ge \alpha_j$ defines a
convex set in the special case of a relaxed keywords grouping problem with normally distributed
costs.
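To make the normal approximation concrete, the sketch below checks the two rule-of-thumb conditions and derives the mean and standard deviation of a keyword cost from the binomial click model. The names `normal_approx_ok`, `click_cost_moments` and `p_click` are hypothetical (`p_click` stands for the click probability $p(x_{ij})$, not the CPC), and the numbers are toy values.

```python
import math


def normal_approx_ok(d_i, p_click, threshold=10):
    """Rule of thumb: both d_i * p and d_i * (1 - p) should be at least 10."""
    return d_i * p_click >= threshold and d_i * (1 - p_click) >= threshold


def click_cost_moments(d_i, p_click, cpc):
    """Mean and std of s_i = cpc * C_i with C_i ~ Binomial(d_i, p_click)."""
    mean = cpc * d_i * p_click
    std = cpc * math.sqrt(d_i * p_click * (1 - p_click))
    return mean, std


# Hypothetical keyword: 1000 searches, 5% click probability, CPC of 2.0.
print(normal_approx_ok(1000, 0.05))
print(click_cost_moments(1000, 0.05, 2.0))
```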
Then we can solve the continuous chance-constrained keywords grouping model by
reformulating it as an equivalent, deterministic second-order cone programming (SOCP) problem
[32]. From search advertising logs and reports, we can get the mean $\mu_i$ and standard deviation
$\sigma_i$ of $s_i$. As $B_j$ is a constant, $\operatorname{Var}[B_j] = 0$ and $E[B_j] = B_j$. Then we have
$$\frac{\sum_{i=1}^{n} x_{ij} s_i - B_j - \big(\sum_{i=1}^{n} x_{ij} E[s_i] - B_j\big)}{\sqrt{\sum_{i=1}^{n} x_{ij}^{2} \operatorname{Var}[s_i]}},$$
which represents a standard normal variate.
The inequality $\sum_{i=1}^{n} x_{ij} s_i \le B_j$ is equivalent to
$$\frac{\sum_{i=1}^{n} x_{ij} s_i - B_j - \big(\sum_{i=1}^{n} x_{ij} E[s_i] - B_j\big)}{\sqrt{\sum_{i=1}^{n} x_{ij}^{2} \operatorname{Var}[s_i]}} \le - \frac{\sum_{i=1}^{n} x_{ij} E[s_i] - B_j}{\sqrt{\sum_{i=1}^{n} x_{ij}^{2} \operatorname{Var}[s_i]}}.$$
Then the chance constraint $P\big(\sum_{i=1}^{n} x_{ij} s_i \le B_j\big) \ge \alpha_j$ is equivalent to
$$P\left(\eta \le - \frac{\sum_{i=1}^{n} x_{ij} E[s_i] - B_j}{\sqrt{\sum_{i=1}^{n} x_{ij}^{2} \operatorname{Var}[s_i]}}\right) \ge \alpha_j,$$
where $\eta$ obeys a standard normal distribution.
² We implicitly assume that the parameter $d_i$ is reasonably large so that the two conditions given above are
satisfied.
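Because $\eta$ is standard normal, the chance constraint on $\eta$ holds exactly when $\sum_{i=1}^{n} x_{ij} \mu_i + \Phi^{-1}(\alpha_j) \sqrt{\sum_{i=1}^{n} x_{ij}^{2} \sigma_i^{2}} \le B_j$, where $\Phi^{-1}$ is the standard normal quantile function; for $\alpha_j \ge 0.5$ this is a second-order cone constraint. The sketch below evaluates this deterministic form for one adgroup. It relies on the independence and normality assumptions made above, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def budget_chance_constraint_holds(x_j, mu, sigma, budget, alpha):
    """Deterministic equivalent of P(sum_i x_ij * s_i <= B_j) >= alpha_j.

    x_j may be binary or relaxed to [0, 1]; the costs s_i are assumed to be
    independent normals with means mu and standard deviations sigma.
    """
    mean_cost = sum(x * m for x, m in zip(x_j, mu))
    std_cost = math.sqrt(sum((x * s) ** 2 for x, s in zip(x_j, sigma)))
    quantile = NormalDist().inv_cdf(alpha)  # about 1.645 for alpha = 0.95
    return mean_cost + quantile * std_cost <= budget


# Hypothetical toy data: both keywords assigned to the adgroup.
print(budget_chance_constraint_holds([1, 1], [100.0, 200.0], [10.0, 20.0], 340.0, 0.95))
print(budget_chance_constraint_holds([1, 1], [100.0, 200.0], [10.0, 20.0], 330.0, 0.95))
```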
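Algorithm (BBKG)
Input:
  $B_j$: the budget of the $j$-th adgroup
  $d_i$: the search demand of the $i$-th keyword
  $c_{ij}$: the CTR of the $i$-th keyword in the $j$-th adgroup
  $r_{ij}$: the CVR of the $i$-th keyword in the $j$-th adgroup
  $v_i$: the value-per-sale of the $i$-th keyword
  $p_{ij}$: the CPC of the $i$-th keyword in the $j$-th adgroup
  $\theta$: the risk-tolerance
Output: $x_{ij}$, indicating whether the $i$-th keyword is assigned to the $j$-th adgroup.
Procedure:
1) Sort adgroups according to decreasing $B_j$, sort keywords according to decreasing
   $E[d_i c_{ij} (r_{ij} v_i - p_{ij})]$, and Keywords_Grouping_List = ∅.
2) For adgroup $j$ from 1 to $m$
     for keyword $i$ from 1 to $n$
       if $\hat{x}_{ij} = 1$, $\operatorname{Var}\big[\sum_{j=1}^{m} \sum_{i=1}^{n} x_{ij} d_i c_{ij} (r_{ij} v_i - p_{ij})\big] / \sum_{j=1}^{m} B_j \le \theta$
       and $\sum_{j=1}^{m} x_{ij} \le 1$, then $x_{ij} = 1$,
       INF = max {the expected profit}, add the feasible solution to
       Keywords_Grouping_List, and the upper bound SUP = ∞.
     End for
   End for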
Keywords_Grouping_List with maximum expected profit, go to step 4.
list then go to step 3.
a plunged or rejected subset, then delete the solution from the list then go back to step 3,
else following the ranking, choose the first accepted keyword that does not already have
a plunged or rejected subset calculate SUP for the subset defined by rejecting this
keyword, go to step 6.
in 2 and add the found branch together with the value SUP to the
Keywords_Grouping_List.
If the expected profit of this solution > INF, then update INF, go to step 3.
Algorithm BBKG searches the complete space of solutions for the optimal keywords
grouping solution within budget chance-constraints and risk-tolerance. The keywords grouping
solution is an 𝑛 × 𝑚 0-1 matrix. At any point during the process, the status with respect to the search
of the keywords grouping solution space is described by a pool of yet unexplored subsets of the
space and the best keywords grouping solution found so far. Initially, only one subset exists,
namely the complete solution space, and the best solution found so far is ∞. The unexplored
subspaces are represented as nodes in a dynamically generated search tree, which initially only
contains the root, and each iteration of the keywords grouping branch-and-bound algorithm processes
one such node. The iteration has three main components: selection of the node to process, bound
calculation, and branching. Our strategy is to select nodes in descending order of
expected keyword profit. After choosing the node, the iteration branches, i.e.,
subdivides the solution space of the node into 𝑚 + 1 subspaces (i.e., 𝑚 + 1 represents the
cases that the keyword is assigned into one of the 𝑚 adgroups or no adgroup) to be investigated in
a subsequent iteration. For each of these, in descending order of the adgroup budgets, the bounding
function for the subspace is calculated and compared to the current best solution, and the node is
branched on if necessary. The bound is calculated by using an interior point method to solve the
continuously relaxed keywords grouping model. If it can be established that the subspace cannot
contain the optimal solution, the whole subspace is discarded; otherwise, it is checked whether the
subspace contains a better solution than the current best keywords grouping solution,
and the better of the two is kept. The search terminates when there are no unexplored parts of the solution
space left, and the optimal solution is then the one recorded as “current best”. For details about the
solution space of the branch-and-bound algorithm, see [13].
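The search described above can be illustrated with a much simplified branch-and-bound sketch. It departs from the paper in several labeled ways: the bound is a simple optimistic sum of the best remaining keyword profits rather than the interior-point solution of the relaxed model, the risk constraint is omitted, and the budget chance constraint is checked in its deterministic normal form with per-keyword cost means and standard deviations. All names (`bbkg_sketch`, `profit`, `mu`, `sigma`) are hypothetical.

```python
import math
from statistics import NormalDist


def bbkg_sketch(profit, mu, sigma, budgets, alpha=0.95):
    """Assign each keyword to one adgroup or to none, maximizing expected profit.

    profit[i][j]: expected profit of keyword i in adgroup j.
    mu[i], sigma[i]: mean and std of the cost of keyword i (independent normals).
    """
    n, m = len(profit), len(budgets)
    z = NormalDist().inv_cdf(alpha)
    # Keywords are processed in descending order of their best expected profit.
    order = sorted(range(n), key=lambda i: -max(profit[i]))
    # Adgroups are tried in descending order of budget.
    groups = sorted(range(m), key=lambda j: -budgets[j])
    # tail[k]: optimistic profit still obtainable from keywords order[k:].
    tail = [0.0] * (n + 1)
    for k in range(n - 1, -1, -1):
        tail[k] = tail[k + 1] + max(0.0, max(profit[order[k]]))

    best = {"value": 0.0, "assign": [None] * n}  # the empty grouping is feasible
    mean, var, assign = [0.0] * m, [0.0] * m, [None] * n

    def search(k, value):
        if value > best["value"]:
            best["value"], best["assign"] = value, assign[:]
        # Stop at a leaf, or prune when the bound cannot beat the incumbent.
        if k == n or value + tail[k] <= best["value"]:
            return
        i = order[k]
        for j in groups:  # m branches: keyword i goes to adgroup j
            old_mean, old_var = mean[j], var[j]
            new_mean, new_var = old_mean + mu[i], old_var + sigma[i] ** 2
            if new_mean + z * math.sqrt(new_var) <= budgets[j]:
                mean[j], var[j], assign[i] = new_mean, new_var, j
                search(k + 1, value + profit[i][j])
                mean[j], var[j], assign[i] = old_mean, old_var, None
        search(k + 1, value)  # branch m + 1: keyword i stays unassigned

    search(0, 0.0)
    return best["value"], best["assign"]


# Hypothetical toy instance: three keywords, two adgroups.
profit = [[50, 40], [36, 30], [20, 25]]
print(bbkg_sketch(profit, mu=[100, 80, 60], sigma=[10, 10, 10], budgets=[200, 100]))
```

The optimistic bound is valid because no keyword can add more than its best single-adgroup profit, so pruning never discards the optimum; it is, however, looser than a bound obtained from the continuous relaxation.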
is divided into three adgroups by the advertiser originally, i.e., basketball (with keywords such as
“basketball shoes”, “cheap basketball shoes”, “kids basketball shoes”, “kobe basketball shoes”,
“high top basketball sneakers”, etc.), running (with keywords such as “running shoes”, “mens
runners”, “buy running shoes”, “running sneakers”, “running shoe online”, “running shoes for
men”, etc.) and soccer (with keywords such as “soccer shoes”, “indoor soccer shoes”, “soccer
cleats”, “soccer boots”, “kids soccer cleats”, etc.). The potential customers of the three adgroups
have interests in shoes for different types of sports. This dataset contains 305 keywords for three
adgroups. Dataset-2 contains records for keywords identical to Dataset-1; the mean and standard
deviations of random factors can be obtained in a similar way. Summary statistics for Dataset-2
are shown in Table 3.
The two datasets are rich enough to investigate the effectiveness of our keywords grouping model
and solution. We assume that there is no significant difference in ad quality across keyword-ad pairs,
as this is a well-developed search advertising effort over multiple years.
5.2 Experimental Setup
The experiments are set up as follows. For the first dataset, the total cost of these
keywords in the chosen ad campaign is 19,200. In experiments on Dataset-1, we increase the total
campaign budget from 2,000 to 20,000 in steps of 2,000, which is allocated to the two adgroups
at the ratio of 2:1. For the second dataset, the total cost of these keywords in the target ad campaign is
66,786. In experiments on Dataset-2, we increase the total campaign budget from 10,000 to 70,000
in steps of 10,000, which is allocated to the three adgroups at the ratio of 3:2:1. In the following
experiments, the probability of the chance constraint (i.e., $\alpha_j$) is set to 0.95. At different levels of
campaign budget, the risk-tolerance (i.e., $\theta$) is $\theta = \infty$ for risk-loving advertisers and
$\theta = 0.3$ for risk-averse advertisers.
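The budget grids in this setup are simple arithmetic; a small sketch of how they could be generated is given below. The helper `split_budget` and the variable names are hypothetical.

```python
def split_budget(total, ratio):
    """Split a total campaign budget across adgroups according to a ratio."""
    weight = sum(ratio)
    return [total * r / weight for r in ratio]


# Dataset-1: total budget 2,000 to 20,000 in steps of 2,000, split 2:1.
dataset1_budgets = [split_budget(b, (2, 1)) for b in range(2000, 20001, 2000)]
# Dataset-2: total budget 10,000 to 70,000 in steps of 10,000, split 3:2:1.
dataset2_budgets = [split_budget(b, (3, 2, 1)) for b in range(10000, 70001, 10000)]

ALPHA = 0.95                               # chance-constraint probability alpha_j
THETA_RISK_LOVING = float("inf")           # theta for risk-loving advertisers
THETA_RISK_AVERSE = 0.3                    # theta for risk-averse advertisers

print(len(dataset1_budgets), len(dataset2_budgets))
```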
5.3 Comparisons
We compare our approach (BBKG) with five baselines with respect to profit, ROI and the number
of keywords assigned to adgroups. As far as we know, there is limited research on keywords
grouping and no comparable approach reported in the state-of-the-art literature. For comparison
purposes, we implement two baseline approaches commonly used in practice, two baselines
derived from the literature on keyword clustering, and a fifth, deterministic approach derived
from our approach. The first baseline represents the case that the advertiser puts all keywords into
a single adgroup (i.e., BASE1-Nogrouping). The second subdivides the keywords according to
products to be promoted by the advertiser (i.e., BASE2-Product). The third baseline approach (i.e.,
BASE3-Kcluster) is derived from a k-means clustering algorithm applied in [39] to understand the
underlying intent of the query terms, which categorizes keywords with similar characteristics of
onsite behaviors, such as pages per visit and click-through rate. In our context, BASE3-Kcluster
categorizes keywords using a set of characteristics associated with each referral keyword,
including impressions, click-through rate, cost-per-click, conversion rate and value-per-sale. The
fourth baseline approach (i.e., BASE4-Hierarchy) is derived from the keyword hierarchy [3].
Specifically, a domain-specific concept hierarchy is constructed on the basis of a high-quality Web
directory such as Wikipedia, and then a keyword hierarchy is established by matching keywords
with relevant concepts. Based on this keyword hierarchy, keywords can be grouped into several
subsets related to different topics. The fifth baseline (i.e., BASE5-Profit) assigns keywords
to adgroups one by one in order of their profits, in a greedy manner, following a deterministic
model derived from our stochastic keywords grouping model developed in Section 4. In the
following experiments, we assign keywords into adgroups using our solution proposed in Section
4 and the five baselines independently. Note that our experiments are conducted in the laboratory
on two real-world datasets of past advertising campaigns.
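One possible reading of the greedy baseline BASE5-Profit is sketched below. The text does not specify the details, so the rule used here (take keywords by descending best expected profit, place each in its most profitable adgroup whose remaining budget covers the keyword's mean cost) is an assumption, as are the function and parameter names.

```python
def greedy_profit_baseline(profit, mean_cost, budgets):
    """Greedy, deterministic keyword assignment by expected profit.

    profit[i][j]: expected profit of keyword i in adgroup j.
    mean_cost[i]: mean cost of keyword i (uncertainty is ignored here).
    Returns a list with the adgroup index of each keyword, or None.
    """
    remaining = list(budgets)
    assign = [None] * len(profit)
    # Visit keywords from the most to the least profitable.
    for i in sorted(range(len(profit)), key=lambda i: -max(profit[i])):
        # Try adgroups from the most to the least profitable for keyword i.
        for j in sorted(range(len(budgets)), key=lambda j: -profit[i][j]):
            if profit[i][j] > 0 and mean_cost[i] <= remaining[j]:
                assign[i] = j
                remaining[j] -= mean_cost[i]
                break
    return assign


# Hypothetical toy instance: three keywords, two adgroups.
print(greedy_profit_baseline([[50, 40], [36, 30], [20, 25]], [100, 80, 60], [200, 100]))
```

Because this rule only compares mean costs with the remaining budget, it can fill an adgroup to a level that would violate a 0.95 chance constraint, which is one way a deterministic baseline can differ from the stochastic model.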
Figure 1 shows the profit and ROI obtained by our approach (BBKG) and the five baselines at
different levels of campaign budget on Dataset-1. Corresponding results on Dataset-2
are shown in Figure 2.
From Figures 1 and 2, we observe the following:
(1) On both Dataset-1 and Dataset-2, profits obtained by our approach and the five baselines
increase with the total campaign budget. In general, with more budget available, more keywords
are included in adgroups, and more profit is then generated.
(2) On both Dataset-1 and Dataset-2, our approach (BBKG) outperforms the five baselines in
terms of profit and ROI. This is because, on the one hand, our approach can traverse more
possibilities by considering uncertainties. On the other hand, there exist a few popular keywords
that are of high profit but expensive. These baselines assign popular keywords to adgroups, instead
of less-popular keywords with fair profit (or ROI). However, our approach, based on the
branch-and-bound algorithm, can avoid such situations by traversing the solution space of keywords
grouping decisions.