Run URL Prioritization (Steps 1-6)

OVERVIEW:

https://dev.myersmediagroup.com/mmgsvn/KMS/trunk/URLPriority/note.txt (Steps 1-6)

PROCESS:

1. Apply migrations under https://dev.myersmediagroup.com/mmgsvn/KMS/trunk/URLPriority/db/migration

2. Make sure the following columns are populated for every template. Note the default is 0 for those columns, so 0-values could mean the correct value was not set, especially for custom templates.

  • url_priority_value
    • Higher value means higher priority.
    • Avoid using the same value across different templates.
  • url_region_priority_damping_factor
    • The multiplying factor for region priority of the URLs in this template.
    • For example Hotels Reviews (t5) should have a very low value, to get lower region priority.
  • url_region_priority_default
    • Default value when region priority of the URLs in this template cannot be calculated.
    • If all URLs in a given template should have region information (region_id), then the url_region_priority_default should be 0.

3. Create entries in URL_Priority_Factor

insert into up.URL_Priority_Factor (name, description, query, need_normalized, weight) values ('Visits', 'Total visits from SEO, Google channel only during last 12 months.', 'select suv.site_url_id as url_id, isnull(e.visitor_count, 0) as value from dbo.SMM_URL_Validation suv with(nolock) left outer join (select site_url_id, sum(visitor_count) as visitor_count from dbo.EDW_URLBased_12mo with(nolock) where site_url_id is not null and search_engine_id = 1 and search_type_id = 1 group by site_url_id) e on suv.site_url_id = e.site_url_id', 1, 10001);
 insert into up.URL_Priority_Factor (name, description, query, need_normalized, weight) values ('Transactions', 'Total transactions from SEO, Google channel only during last 12 months.', 'select suv.site_url_id as url_id, isnull(e.transactions, 0) as value from dbo.SMM_URL_Validation suv with(nolock) left outer join (select site_url_id, sum(transactions) as transactions from dbo.EDW_URLBased_12mo with(nolock) where site_url_id is not null and search_engine_id = 1 and search_type_id = 1 group by site_url_id) e on suv.site_url_id = e.site_url_id', 1, 9001);
 insert into up.URL_Priority_Factor (name, description, query, need_normalized, weight) values ('Gross Bookings', 'Total gross bookings from SEO, Google channel only during last 12 months.', 'select suv.site_url_id as url_id, isnull(e.gross_bookings_usd, 0) as value from dbo.SMM_URL_Validation suv with(nolock) left outer join (select site_url_id, sum(gross_bookings_usd) as gross_bookings_usd from dbo.EDW_URLBased_12mo with(nolock) where site_url_id is not null and search_engine_id = 1 and search_type_id = 1 group by site_url_id) e on suv.site_url_id = e.site_url_id', 1, 7001);
 insert into up.URL_Priority_Factor (name, description, query, need_normalized, weight) values ('Shoppers', 'Total shoppers from SEO, Google channel only during last 12 months.', 'select suv.site_url_id as url_id, isnull(e.shopper_counts, 0) as value from dbo.SMM_URL_Validation suv with(nolock) left outer join (select site_url_id, sum(shopper_counts) as shopper_counts from dbo.EDW_URLBased_12mo with(nolock) where site_url_id is not null and search_engine_id = 1 and search_type_id = 1 group by site_url_id) e on suv.site_url_id = e.site_url_id', 1, 6001);
 insert into up.URL_Priority_Factor (name, description, query, need_normalized, weight) values ('Region Priority', 'URL''s region priority, calculated from multiplying region priority with dbo.Template.url_region_priority_damping_factor.', 'select suv.site_url_id as url_id, t.url_region_priority_damping_factor * isnull(rpf.priority_factor, t.url_region_priority_default) as value from dbo.SMM_URL_Validation suv with(nolock) join dbo.Template t with(nolock) on suv.template_id = t.template_id left outer join dbo.Site_URL_Contextual suc with(nolock) on suv.site_url_id = suc.site_url_id left outer join dbo.Region_Priority_Factor rpf with(nolock) on suc.region_id = rpf.region_id where suv.template_id <> 3008 union select suv.site_url_id as url_id, t.url_region_priority_damping_factor * isnull((r1.priority_factor+r2.priority_factor)/2, t.url_region_priority_default) as value from dbo.SMM_URL_Validation suv with(nolock) join dbo.Template t with(nolock) on suv.template_id = t.template_id left outer join ds.Route_Region rr with(nolock) on suv.site_url_id = rr.site_url_id left outer join dbo.Region_Priority_Factor r1 with(nolock) on rr.origin_region_id = r1.region_id left outer join dbo.Region_Priority_Factor r2 with(nolock) on rr.destination_region_id = r2.region_id where suv.template_id = 3008', 0, 19.1);
 insert into up.URL_Priority_Factor (name, description, query, need_normalized, weight) values ('Template Priority', 'URL''s template priority calculated from dbo.Template.url_priority_value.', 'select suv.site_url_id as url_id, t.url_priority_value as value from dbo.SMM_URL_Validation suv with(nolock) join dbo.Template t with(nolock) on suv.template_id = t.template_id', 0, 10.1);
 insert into up.URL_Priority_Factor (name, description, query, need_normalized, weight) values ('eSV', 'Estimated Search Volume for URL.', 'select site_url_id as url_id, sv as value from (select kuma.*, isnull(vks.monthly_sv, -1) as sv, row_number() over (partition by kuma.site_url_id order by isnull(vks.monthly_sv, -1) desc) as rn from dbo.KW_URL_Main_Assignment_Distribution kuma with(nolock) left outer join kwg.vw_KW_SV vks with(nolock) on kuma.kw_id = vks.kw_id and vks.row_num = 1 where exists (select 1 from dbo.KW_Collection kc with(nolock) join dbo.Collection c with(nolock) on kc.collection_id = c.id and c.name = ''ILG Inclusion List'' where kc.kw_id = kuma.kw_id) ) t where rn = 1', 1, 5001);

* Please validate the “query” column if necessary, e.g. different (custom) templates might need ways to calculate ‘Region Priority’

4a. (Optional) Purge SEO_StaticURLs table to improve performance of the next step.

delete ds.SEO_StaticURLs
from ds.SEO_StaticURLs s
join SMM_URL_Validation v on dbo.udf_strip_url_protocol(s.url) = v.url_no_protocol and s.template_id != v.template_id;

4. Repopulate edw_urlbased_12mo table, which used to calculate URL priority.

4.1. For non-primary-language POS, import EDW data from the primary-language POS first with something like:

4.1.1 Method A

-- If one does not already exist, create a dummy EDW_File entry:
 insert into dbo.EDW_File (name, status, updated, note) values (
 '[dummy]', 0, getdate(), 'created manually by {user}');

-- Populate dbo.EDW_URL primary table:
 set identity_insert EDW_URL on;
 insert into dbo.EDW_URL (id, url)
 select distinct id, url
 from {primary_gkms}.dbo.EDW_URL u
 where url_no_protocol like '{domain}/{secondary_site_identifier}/%';

set identity_insert EDW_URL off;

-- Populate EDW_URLBased_Weekly table:
 insert into dbo.EDW_URLBased_Weekly (
 edw_file_id, rpt_wk_begin_date, url, edw_url_id, omniture_market_clicks_page, search_engine, search_engine_id, search_type_id, mktg_sub_chnnl_name, is_branded, lob, visitor_count, shopper_counts, bounce_visitor_cnt, transactions, gross_bookings_usd, gross_profit_usd)
 select {edw_file_id}, rpt_wk_begin_date, url, edw_url_id, omniture_market_clicks_page, search_engine, search_engine_id, search_type_id, mktg_sub_chnnl_name, is_branded, lob, visitor_count, shopper_counts, bounce_visitor_cnt, transactions, gross_bookings_usd, gross_profit_usd
 from {primary_gkms}.dbo.EDW_URLBased_Weekly
 where url_no_protocol like '{domain}/{secondary_site_identifier}/%'

4.1.1 Method B

-- Copy EDW_File entry:
 :setvar primary_gkms ""

SET IDENTITY_INSERT EDW_File ON
 go

declare @last_id int;
 select @last_id = max(id) from EDW_File;

insert into EDW_File (id, name, status, updated, note, year_month)
 select id, name, status, updated, note, year_month
 from $(primary_gkms).dbo.EDW_File
 where id > @last_id;

SET IDENTITY_INSERT EDW_File OFF
 go

-- Populate dbo.EDW_URL primary table:
truncate table edw_urlbased_weekly;
 delete from EDW_URL;

set identity_insert EDW_URL on;

insert into dbo.EDW_URL (id, url)
 select distinct id, url
 from $(primary_gkms).dbo.EDW_URL u
 where -- list all posibilities
 (url_no_protocol like 'www.expedia.ca/fr/%' or
 url_no_protocol like 'flights.expedia.ca/ca-fr/%' or
 url_no_protocol like 'cruise.expedia.ca/Cruise/fr-CA/%' or
 url_no_protocol like 'thingstodo.expedia.ca/ca-fr/%' or
 url_no_protocol like 'www.expedia.ca/things-to-do/fr/%')

set identity_insert EDW_URL off;
 
 insert into dbo.EDW_URLBased_Weekly (
 edw_file_id, rpt_wk_begin_date, url, edw_url_id, omniture_market_clicks_page, search_engine, search_engine_id, search_type_id, mktg_sub_chnnl_name, is_branded, lob, visitor_count, shopper_counts, bounce_visitor_cnt, transactions, gross_bookings_usd, gross_profit_usd)
 select edw_file_id, rpt_wk_begin_date, url, edw_url_id, omniture_market_clicks_page, search_engine, search_engine_id, search_type_id, mktg_sub_chnnl_name, is_branded, lob, visitor_count, shopper_counts, bounce_visitor_cnt, transactions, gross_bookings_usd, gross_profit_usd
 from $(primary_gkms).dbo.EDW_URLBased_Weekly
 where -- list all posibilities
 (dbo.udf_strip_url_protocol(url) like 'www.expedia.ca/fr/%' or
 dbo.udf_strip_url_protocol(url) like 'flights.expedia.ca/ca-fr/%' or
 dbo.udf_strip_url_protocol(url) like 'cruise.expedia.ca/Cruise/fr-CA/%' or
 dbo.udf_strip_url_protocol(url) like 'thingstodo.expedia.ca/ca-fr/%' or
 dbo.udf_strip_url_protocol(url) like 'www.expedia.ca/things-to-do/fr/%');

4.2. For all languages, run

exec usp_populate_edw_urlbased_12mo
  • Note that the output message might indicate a need to run SMG on template 102.
  • Check to see if most of the URLs in edw_urlbased_12mo have been generated:
select count(*), count(distinct site_url_id) from edw_urlbased_12mo;
select count(*) from edw_urlbased_12mo where site_url_id is null;

5. Scrub Foreign Domain URLs.

  • Check and remove all URLs that do match the correct domain. i.e. hotels.com URLs in expedia.co.th

5.1. Check sv_factor values in kwg.KW_Variant.

  • Check variants (in variant_group 0) with sv_factor > 1. Be aware of following issues:
    • sv_factor should not be too high.
    • For variants which exist in multiple templates, the highest-priority region-type should get the highest sv_factor.
  • Check variants for Flights Routes template (t3008).
  • From experience, the highest sv_factor value of all variants should be only around 0.005.
  • If necessary, make change and re-run KWG to recalculate eSV. Update Keyword Priority afterwards.

5.2. Use up.Collection.url_priority_multiplier to adjust URL priorities.

  • Create an entry in up.Collection
  • Group URLs into up.URL_Collection with given collection_id
  • Set up url_priority_multiplier value. A value > 1 will boost URL priority. Values less than 1 but great than 0 will damp URL priority. Value 0 or 1 has no effect. Negative values are not allowed.
  • If a URL has multiple url_priority_multiplier values from different collections, the highest value will be used.
  • Rule of thumb is to apply the boost gradually (e.g. start with url_priority_multiplier = 1.5) but damper more (e.g. url_priority_multiplier = 0.01)

6. Populate up.URL_Priority_Value. URL priority (up.vw_URL_Priority) will be updated automatically.

exec [up].[usp_populate_url_priority_value];

History

Leave a Reply