How to create an asset using REST API

The following article describes how Upload API V2 can be used:
https://docs.stylelabs.com/en-us/contenthub/4.0.x/content/integrations/rest-api/upload/upload-api-v2.html.
 
Let’s demonstrate the usage of the API using PowerShell 7. The upload flow consists of three steps: requesting an upload, performing the upload, and finalizing it.

Request an upload

First, an asset needs to be created. It can be done using the following code:

$xAuthToken = "enter your auth token guid here"
$hostName = "https://YourCHHostname"

$fileName = "Test.png"
$filePath = "C:\TempDownloads\Test.png"
$fileSize = (Get-Item -LiteralPath $filePath).Length # file size in bytes (792 for the sample file)

# create request

$createAssetBody = @{
    file_name = $fileName
    file_size = $fileSize
    upload_configuration = @{
        name = "AssetUploadConfiguration"
    }
    action = @{
        name = "NewAsset"
    }
} | ConvertTo-Json

$createUrl = $hostName + "/api/v2.0/upload"

$createResponse = Invoke-WebRequest -Uri $createUrl -Method 'POST' -Body $createAssetBody -Headers @{"x-auth-token" = $xAuthToken; "Content-Type" = "application/json"}

 
In the response, we will get the following data:

  1. The response body contains upload_identifier and file_identifier:
{
  "upload_identifier": "uploadidentifier",
  "file_identifier": "fileidentifier"
}

  2. The Location response header contains the URL to which the file needs to be uploaded.

Perform the upload

For uploading the file, the approach described in this article is used: https://docs.microsoft.com/en-us/powershell/module/microsoft.powershell.utility/invoke-webrequest?view=powershell-7.2#example-5--submit-a-multipart-form-data-file

# upload request

$uploadUrl = $createResponse.Headers.Location[0]

$fileStream = [System.IO.FileStream]::new($filePath, [System.IO.FileMode]::Open)
$fileHeader = [System.Net.Http.Headers.ContentDispositionHeaderValue]::new('form-data')
$fileHeader.Name = $fileName
$fileHeader.FileName = Split-Path -leaf $filePath
$fileContent = [System.Net.Http.StreamContent]::new($fileStream)
$fileContent.Headers.ContentDisposition = $fileHeader
$fileContent.Headers.ContentType = [System.Net.Http.Headers.MediaTypeHeaderValue]::Parse("text/plain")

$multipartContent = [System.Net.Http.MultipartFormDataContent]::new()
$multipartContent.Add($fileContent)

Invoke-WebRequest -Body $multipartContent -Method 'POST' -Uri $uploadUrl -Headers @{"x-auth-token" = $xAuthToken}

$fileStream.Close()

Finalize the upload

To finalize the upload, we need to post the response body received in the "Request an upload" step:

#finalize request

$finalizeUrl = $hostName + "/api/v2.0/upload/finalize"

Invoke-WebRequest -Uri $finalizeUrl -Method 'POST' -Body $createResponse.Content -Headers @{"x-auth-token" = $xAuthToken; "Content-Type" = "application/json"}

Full script

$xAuthToken = "enter your auth token guid here"
$hostName = "https://YourCHHostname"

$fileName = "Test.png"
$filePath = "C:\TempDownloads\Test.png"
$fileSize = (Get-Item -LiteralPath $filePath).Length # file size in bytes (792 for the sample file)

# create request

$createAssetBody = @{
    file_name = $fileName
    file_size = $fileSize
    upload_configuration = @{
        name = "AssetUploadConfiguration"
    }
    action = @{
        name = "NewAsset"
    }
} | ConvertTo-Json

$createUrl = $hostName + "/api/v2.0/upload"

$createResponse = Invoke-WebRequest -Uri $createUrl -Method 'POST' -Body $createAssetBody -Headers @{"x-auth-token" = $xAuthToken; "Content-Type" = "application/json"}

# upload request

$uploadUrl = $createResponse.Headers.Location[0]

$fileStream = [System.IO.FileStream]::new($filePath, [System.IO.FileMode]::Open)
$fileHeader = [System.Net.Http.Headers.ContentDispositionHeaderValue]::new('form-data')
$fileHeader.Name = $fileName
$fileHeader.FileName = Split-Path -leaf $filePath
$fileContent = [System.Net.Http.StreamContent]::new($fileStream)
$fileContent.Headers.ContentDisposition = $fileHeader
$fileContent.Headers.ContentType = [System.Net.Http.Headers.MediaTypeHeaderValue]::Parse("text/plain")

$multipartContent = [System.Net.Http.MultipartFormDataContent]::new()
$multipartContent.Add($fileContent)

Invoke-WebRequest -Body $multipartContent -Method 'POST' -Uri $uploadUrl -Headers @{"x-auth-token" = $xAuthToken}

$fileStream.Close()

#finalize request

$finalizeUrl = $hostName + "/api/v2.0/upload/finalize"

Invoke-WebRequest -Uri $finalizeUrl -Method 'POST' -Body $createResponse.Content -Headers @{"x-auth-token" = $xAuthToken; "Content-Type" = "application/json"}

Creating and updating public links via REST API in Sitecore Content Hub

To create or update a public link, the X-Auth-Token header must be set on every request you send. The token can be obtained as described here: https://docs.stylelabs.com/contenthub/4.1.x/content/integrations/rest-api/authenticate/get-token.html

Send a POST request to /api/entitydefinitions/M.PublicLink/entities URL with the following body:

{
  "properties": {
    "RelativeUrl": "067baf76e8eb4ef2a6ece6fe611013a0",
    "Resource": "downloadOriginal",
    "ExpirationDate": "2021-10-30T12:16:00.176Z",
    "ConversionConfiguration": {}
  },
  "is_root_taxonomy_item": false,
  "is_path_root": false,
  "inherits_security": true,
  "entitydefinition": {
    "href": "https://YourCHHostnameGoesHere/api/entitydefinitions/M.PublicLink"
  },
  "relations": {
    "AssetToPublicLink": {
      "parents": [
        {
          "href": "https://YourCHHostnameGoesHere/api/entities/IDOFYOURASSET"
        }
      ]
    }
  }
}

 
If the operation is successful, the response has status code 201 and its body contains the id and identifier of the created entity. RelativeUrl must be a unique random GUID.
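As a rough illustration, the request body above can be assembled programmatically. The Python sketch below only builds the JSON and does not send it; the function name build_public_link_body and the asset id 12345 are made up for the example, and the RelativeUrl is generated as a random GUID without dashes to match the format of the sample value.

```python
import json
import uuid


def build_public_link_body(host, asset_id, expiration_date, resource="downloadOriginal"):
    """Build the body for POST /api/entitydefinitions/M.PublicLink/entities.

    Hypothetical helper: it mirrors the sample JSON above and nothing more.
    """
    return {
        "properties": {
            # A fresh random GUID (32 hex characters) for every link.
            "RelativeUrl": uuid.uuid4().hex,
            "Resource": resource,
            "ExpirationDate": expiration_date,
            "ConversionConfiguration": {},
        },
        "is_root_taxonomy_item": False,
        "is_path_root": False,
        "inherits_security": True,
        "entitydefinition": {
            "href": f"{host}/api/entitydefinitions/M.PublicLink"
        },
        "relations": {
            "AssetToPublicLink": {
                "parents": [{"href": f"{host}/api/entities/{asset_id}"}]
            }
        },
    }


# 12345 is a placeholder asset id, not a real one.
body = build_public_link_body(
    "https://YourCHHostnameGoesHere", 12345, "2021-10-30T12:16:00.176Z"
)
payload = json.dumps(body, indent=2)
print(payload)
```

The resulting payload string is what would be sent as the POST body, together with the X-Auth-Token header.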

Send a GET request to /api/entities/PublicLinkEntityId and check the properties.Status value in the returned response. When it is Completed, the link is ready.

To update a public link, send a PUT request to /api/entities/PublicLinkEntityId with a body that contains the properties which need to be updated.

For example, changing the ExpirationDate can be done as follows:

{
  "properties": {
    "ExpirationDate": "2021-11-30T12:16:00.176Z"
  },
  "entitydefinition": {
    "href": "https://YourCHHostnameGoesHere/api/entitydefinitions/M.PublicLink"
  }
}

 
The entitydefinition property is mandatory and must be present in the request.

Using .foreach in WinDbg to find the sizes of array elements

There is an array of objects and the goal is to find the biggest ones. !DumpArray outputs the addresses of the elements, which we can use:

0:014> !DumpArray /d 000001d837b4d860
Name: Sitecore.Xdb.Collection.Model.ContactDataRecord[]
MethodTable: 00007ffb94f03888
EEClass: 00007ffbf15d56b0
Size: 83312(0x14570) bytes
Array: Rank 1, Number of elements 10411, Type CLASS
Element Methodtable: 00007ffb948360f0
[0] 000001d8537b06c0
[1] 000001d8537c9e38
[2] 000001d8537ca220
[3] 000001d8537cddd8
[4] 000001d8537ce060
...

 
To iterate over the addresses, the .foreach command can be used:
https://docs.microsoft.com/en-us/windows-hardware/drivers/debugger/-foreach
 
Let’s first write a command which iterates over the !DumpArray command output:

0:014> .foreach (record { !DumpArray /d 000001d837b4d860 }) { .echo ${record} }
Name:
Sitecore.Xdb.Collection.Model.ContactDataRecord[]
MethodTable:
00007ffb94f03888
EEClass:
00007ffbf15d56b0
Size:
83312(0x14570)
bytes
Array:
Rank
1,
Number
of
elements
10411,
Type
CLASS
Element
Methodtable:
00007ffb948360f0
[0]
000001d8537b06c0
[1]
000001d8537c9e38
[2]
000001d8537ca220
[3]
000001d8537cddd8
[4]
000001d8537ce060
...

 
In detail, the command does the following:

  1. Parses the output of the nested command. The output is split into whitespace-separated tokens, and record is the name of the variable which holds the current token:
(record { !DumpArray /d 000001d837b4d860 })

  2. Executes the command for each token. ${record} can be used to reference the variable value:
{ .echo ${record} }

 
Now our goal is to remove the unnecessary tokens. It can be done using the /pS and /ps parameters.

/pS skips the first n tokens of the output. In our case, we don’t need the tokens which go before the first address, up to and including [0]. There are 22 such tokens. The parameters use hexadecimal format, so instead of 22 we need to use 16 (0x16).

0:014> .foreach /pS 16 (record { !DumpArray /d 000001d837b4d860 }) { .echo ${record} }
000001d8537b06c0
[1]
000001d8537c9e38
[2]
000001d8537ca220
[3]
000001d8537cddd8
[4]
000001d8537ce060
...

 
Looks better, but we still have the [n] values. The /ps parameter makes .foreach skip n tokens after each processed one. In our case, we want to take every other token, so using 1 for /ps leaves only the addresses.

0:014> .foreach /pS 16 /ps 1 (record { !DumpArray /d 000001d837b4d860 }) { .echo ${record} }
000001d8537b06c0
000001d8537c9e38
000001d8537ca220
000001d8537cddd8
000001d8537ce060
...
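To double-check the numbers used for /pS and /ps, the token handling can be imitated outside the debugger. The Python sketch below is only a model of the behavior shown above (split on whitespace, skip some initial tokens, then skip a fixed number of tokens after each processed one); the function name foreach_tokens is made up, and the sample text is the first five elements of the dump.

```python
def foreach_tokens(output, initial_skip=0, repeat_skip=0):
    """Model of .foreach token selection.

    initial_skip plays the role of /pS, repeat_skip the role of /ps.
    """
    tokens = output.split()  # whitespace-separated tokens
    return tokens[initial_skip::repeat_skip + 1]


dump_output = """
Name: Sitecore.Xdb.Collection.Model.ContactDataRecord[]
MethodTable: 00007ffb94f03888
EEClass: 00007ffbf15d56b0
Size: 83312(0x14570) bytes
Array: Rank 1, Number of elements 10411, Type CLASS
Element Methodtable: 00007ffb948360f0
[0] 000001d8537b06c0
[1] 000001d8537c9e38
[2] 000001d8537ca220
[3] 000001d8537cddd8
[4] 000001d8537ce060
"""

# Number of tokens up to and including "[0]", printed in decimal and hex.
header_tokens = dump_output.split().index("[0]") + 1
print(header_tokens, hex(header_tokens))

addresses = foreach_tokens(dump_output, initial_skip=0x16, repeat_skip=1)
print(addresses)
```

Counting the tokens this way is handy when the header of the command output differs from the one above, since the /pS value has to be recalculated.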

 
And finally, let’s change the .echo command to !objsize to get the size of each record:

0:014> .foreach /pS 16 /ps 1 (record { !DumpArray /d 000001d837b4d860 }) { !objsize ${record} }
sizeof(000001d8537b06c0) = 256288 (0x3e920) bytes (Sitecore.Xdb.Collection.Model.ContactDataRecord)
sizeof(000001d8537c9e38) = 163792 (0x27fd0) bytes (Sitecore.Xdb.Collection.Model.ContactDataRecord)
sizeof(000001d8537ca220) = 329792 (0x50840) bytes (Sitecore.Xdb.Collection.Model.ContactDataRecord)
sizeof(000001d8537cddd8) = 90336 (0x160e0) bytes (Sitecore.Xdb.Collection.Model.ContactDataRecord)
sizeof(000001d8537ce060) = 330552 (0x50b38) bytes (Sitecore.Xdb.Collection.Model.ContactDataRecord)
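With more than ten thousand elements, the !objsize output is too long to scan by eye. One option is to save the debugger output to a text file and sort it offline. The Python sketch below assumes the line format shown above; the function name largest_objects is made up for the example.

```python
import re

# Matches lines such as:
# sizeof(000001d8537b06c0) = 256288 (0x3e920) bytes (Some.Type.Name)
OBJSIZE_LINE = re.compile(
    r"sizeof\((?P<address>[0-9a-fA-F]+)\)\s*=\s*(?P<size>\d+)\s*"
    r"\(0x[0-9a-fA-F]+\)\s*bytes\s*\((?P<type>[^)]*)\)"
)


def largest_objects(objsize_output, top=3):
    """Return (size, address, type) tuples for the biggest objects."""
    rows = []
    for line in objsize_output.splitlines():
        match = OBJSIZE_LINE.search(line)
        if match:  # lines in any other format are ignored
            rows.append(
                (int(match.group("size")), match.group("address"), match.group("type"))
            )
    rows.sort(reverse=True)  # biggest first
    return rows[:top]


objsize_output = """
sizeof(000001d8537b06c0) = 256288 (0x3e920) bytes (Sitecore.Xdb.Collection.Model.ContactDataRecord)
sizeof(000001d8537c9e38) = 163792 (0x27fd0) bytes (Sitecore.Xdb.Collection.Model.ContactDataRecord)
sizeof(000001d8537ca220) = 329792 (0x50840) bytes (Sitecore.Xdb.Collection.Model.ContactDataRecord)
sizeof(000001d8537cddd8) = 90336 (0x160e0) bytes (Sitecore.Xdb.Collection.Model.ContactDataRecord)
sizeof(000001d8537ce060) = 330552 (0x50b38) bytes (Sitecore.Xdb.Collection.Model.ContactDataRecord)
"""

for size, address, type_name in largest_objects(objsize_output):
    print(address, size, type_name)
```

The addresses at the top of the list are the ones worth inspecting further in the debugger.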

How to get the SOLR query generated by xConnect search and replay it

Sometimes there is a need to check what query is generated by xConnect search and passed to SOLR.

This article describes how the query can be found and replayed.

First of all, debug mode needs to be enabled in the SOLR admin panel. It can be done as shown in the picture below:

After this change, the executed queries will be written to the SOLR log.

Here is an example of such a query:

2019-09-06 09:05:44.148 DEBUG (qtp1330278544-196) [   x:sc901_xdb] o.a.s.s.s.LocalStatsCache ## GET {q=(x_type_s:(ContactDataRecord)+AND+_query_:("\{\!type%3Djoin+from%3Dcontactid_s+to%3Did\}\(x_type_s\:\(InteractionDataRecord\)+AND+_query_\:\(\"\\\{\\\!type%3Dparent+which%3Dx_type_s\\\:\\\(InteractionDataRecord\\\)\\\}\\\(path_s\\\:\\\(Events\\\)+AND+\\\(\\\(@odata.type_s\\\:\\\(#Sitecore.EmailCampaign.Model.XConnect.Events.UnsubscribedFromEmailEvent\\\)+OR+\\\(@odata.type_s\\\:\\\(#Sitecore.EmailCampaign.Model.XConnect.Events.SpamComplaintEvent\\\)+OR+\\\(@odata.type_s\\\:\\\(#Sitecore.EmailCampaign.Model.XConnect.Events.EmailSentEvent\\\)+OR+\\\(@odata.type_s\\\:\\\(#Sitecore.EmailCampaign.Model.XConnect.Events.EmailOpenedEvent\\\)+OR+\\\(@odata.type_s\\\:\\\(#Sitecore.EmailCampaign.Model.XConnect.Events.EmailClickedEvent\\\)+OR+\\\(@odata.type_s\\\:\\\(#Sitecore.EmailCampaign.Model.XConnect.Events.DispatchFailedEvent\\\)+OR+\\\(@odata.type_s\\\:\\\(#Sitecore.EmailCampaign.Model.XConnect.Events.BounceEvent\\\)+OR+@odata.type_s\\\:\\\(#Sitecore.EmailCampaign.Model.XConnect.Events.EmailEvent\\\)\\\)\\\)\\\)\\\)\\\)\\\)\\\)+AND+\\\(definitionid_s\\\:\\\(2a65acc5985140dd851b23f7a6c53092\\\)+AND+messageid_s\\\:\\\(c7b8358d3fea4138b2b15d457bf4d2b6\\\)\\\)\\\)\\\)\"\)\)"))&df=_text_&echoParams=explicit&fl=id&cursorMark=*&json={"query":"(x_type_s:(ContactDataRecord)+AND+_query_:(\"\\{\\!type%3Djoin+from%3Dcontactid_s+to%3Did\\}\\(x_type_s\\:\\(InteractionDataRecord\\)+AND+_query_\\:\\(\\\"\\\\\\{\\\\\\!type%3Dparent+which%3Dx_type_s\\\\\\:\\\\\\(InteractionDataRecord\\\\\\)\\\\\\}\\\\\\(path_s\\\\\\:\\\\\\(Events\\\\\\)+AND+\\\\\\(\\\\\\(@odata.type_s\\\\\\:\\\\\\(#Sitecore.EmailCampaign.Model.XConnect.Events.UnsubscribedFromEmailEvent\\\\\\)+OR+\\\\\\(@odata.type_s\\\\\\:\\\\\\(#Sitecore.EmailCampaign.Model.XConnect.Events.SpamComplaintEvent\\\\\\)+OR+\\\\\\(@odata.type_s\\\\\\:\\\\\\(#Sitecore.EmailCampaign.Model.XConnect.Events.EmailSentEvent\\\\\\)+OR+\\\\\\(@odata.t
ype_s\\\\\\:\\\\\\(#Sitecore.EmailCampaign.Model.XConnect.Events.EmailOpenedEvent\\\\\\)+OR+\\\\\\(@odata.type_s\\\\\\:\\\\\\(#Sitecore.EmailCampaign.Model.XConnect.Events.EmailClickedEvent\\\\\\)+OR+\\\\\\(@odata.type_s\\\\\\:\\\\\\(#Sitecore.EmailCampaign.Model.XConnect.Events.DispatchFailedEvent\\\\\\)+OR+\\\\\\(@odata.type_s\\\\\\:\\\\\\(#Sitecore.EmailCampaign.Model.XConnect.Events.BounceEvent\\\\\\)+OR+@odata.type_s\\\\\\:\\\\\\(#Sitecore.EmailCampaign.Model.XConnect.Events.EmailEvent\\\\\\)\\\\\\)\\\\\\)\\\\\\)\\\\\\)\\\\\\)\\\\\\)\\\\\\)+AND+\\\\\\(definitionid_s\\\\\\:\\\\\\(2a65acc5985140dd851b23f7a6c53092\\\\\\)+AND+messageid_s\\\\\\:\\\\\\(c7b8358d3fea4138b2b15d457bf4d2b6\\\\\\)\\\\\\)\\\\\\)\\\\\\)\\\"\\)\\)\"))","sort":"id+asc"}&sort=id+asc&rows=0&wt=json}

If you are lucky and the query is not very complicated, the q parameter can simply be copied and replayed on SOLR manually. However, SOLR won’t be able to parse the query above as it is.

I perform the following actions to make the query work and to make it more readable.

First, retrieve only the q parameter value (everything between “q=” and the next “&”). The result will be the following:

(x_type_s:(ContactDataRecord)+AND+_query_:("\{\!type%3Djoin+from%3Dcontactid_s+to%3Did\}\(x_type_s\:\(InteractionDataRecord\)+AND+_query_\:\(\"\\\{\\\!type%3Dparent+which%3Dx_type_s\\\:\\\(InteractionDataRecord\\\)\\\}\\\(path_s\\\:\\\(Events\\\)+AND+\\\(\\\(@odata.type_s\\\:\\\(#Sitecore.EmailCampaign.Model.XConnect.Events.UnsubscribedFromEmailEvent\\\)+OR+\\\(@odata.type_s\\\:\\\(#Sitecore.EmailCampaign.Model.XConnect.Events.SpamComplaintEvent\\\)+OR+\\\(@odata.type_s\\\:\\\(#Sitecore.EmailCampaign.Model.XConnect.Events.EmailSentEvent\\\)+OR+\\\(@odata.type_s\\\:\\\(#Sitecore.EmailCampaign.Model.XConnect.Events.EmailOpenedEvent\\\)+OR+\\\(@odata.type_s\\\:\\\(#Sitecore.EmailCampaign.Model.XConnect.Events.EmailClickedEvent\\\)+OR+\\\(@odata.type_s\\\:\\\(#Sitecore.EmailCampaign.Model.XConnect.Events.DispatchFailedEvent\\\)+OR+\\\(@odata.type_s\\\:\\\(#Sitecore.EmailCampaign.Model.XConnect.Events.BounceEvent\\\)+OR+@odata.type_s\\\:\\\(#Sitecore.EmailCampaign.Model.XConnect.Events.EmailEvent\\\)\\\)\\\)\\\)\\\)\\\)\\\)\\\)+AND+\\\(definitionid_s\\\:\\\(2a65acc5985140dd851b23f7a6c53092\\\)+AND+messageid_s\\\:\\\(c7b8358d3fea4138b2b15d457bf4d2b6\\\)\\\)\\\)\\\)\"\)\)"))
  1. Decode the URL using some online decoder (for example: https://meyerweb.com/eric/tools/dencoder/)

    The result will be the following:

    (x_type_s:(ContactDataRecord) AND _query_:("\{\!type=join from=contactid_s to=id\}\(x_type_s\:\(InteractionDataRecord\) AND _query_\:\(\"\\\{\\\!type=parent which=x_type_s\\\:\\\(InteractionDataRecord\\\)\\\}\\\(path_s\\\:\\\(Events\\\) AND \\\(\\\(@odata.type_s\\\:\\\(#Sitecore.EmailCampaign.Model.XConnect.Events.UnsubscribedFromEmailEvent\\\) OR \\\(@odata.type_s\\\:\\\(#Sitecore.EmailCampaign.Model.XConnect.Events.SpamComplaintEvent\\\) OR \\\(@odata.type_s\\\:\\\(#Sitecore.EmailCampaign.Model.XConnect.Events.EmailSentEvent\\\) OR \\\(@odata.type_s\\\:\\\(#Sitecore.EmailCampaign.Model.XConnect.Events.EmailOpenedEvent\\\) OR \\\(@odata.type_s\\\:\\\(#Sitecore.EmailCampaign.Model.XConnect.Events.EmailClickedEvent\\\) OR \\\(@odata.type_s\\\:\\\(#Sitecore.EmailCampaign.Model.XConnect.Events.DispatchFailedEvent\\\) OR \\\(@odata.type_s\\\:\\\(#Sitecore.EmailCampaign.Model.XConnect.Events.BounceEvent\\\) OR @odata.type_s\\\:\\\(#Sitecore.EmailCampaign.Model.XConnect.Events.EmailEvent\\\)\\\)\\\)\\\)\\\)\\\)\\\)\\\) AND \\\(definitionid_s\\\:\\\(2a65acc5985140dd851b23f7a6c53092\\\) AND messageid_s\\\:\\\(c7b8358d3fea4138b2b15d457bf4d2b6\\\)\\\)\\\)\\\)\"\)\)"))
  2. Remove the backslashes:

    (x_type_s:(ContactDataRecord) AND _query_:("{!type=join from=contactid_s to=id}(x_type_s:(InteractionDataRecord) AND _query_:("{!type=parent which=x_type_s:(InteractionDataRecord)}(path_s:(Events) AND ((@odata.type_s:(#Sitecore.EmailCampaign.Model.XConnect.Events.UnsubscribedFromEmailEvent) OR (@odata.type_s:(#Sitecore.EmailCampaign.Model.XConnect.Events.SpamComplaintEvent) OR (@odata.type_s:(#Sitecore.EmailCampaign.Model.XConnect.Events.EmailSentEvent) OR (@odata.type_s:(#Sitecore.EmailCampaign.Model.XConnect.Events.EmailOpenedEvent) OR (@odata.type_s:(#Sitecore.EmailCampaign.Model.XConnect.Events.EmailClickedEvent) OR (@odata.type_s:(#Sitecore.EmailCampaign.Model.XConnect.Events.DispatchFailedEvent) OR (@odata.type_s:(#Sitecore.EmailCampaign.Model.XConnect.Events.BounceEvent) OR @odata.type_s:(#Sitecore.EmailCampaign.Model.XConnect.Events.EmailEvent)))))))) AND (definitionid_s:(2a65acc5985140dd851b23f7a6c53092) AND messageid_s:(c7b8358d3fea4138b2b15d457bf4d2b6))))"))"))

    Now the query is readable and can be analyzed. If you want to replay it as well, extra modifications are required.

  3. Hashes can’t be parsed, so you need to replace them with the corresponding symbol code: %23

    (x_type_s:(ContactDataRecord) AND _query_:("{!type=join from=contactid_s to=id}(x_type_s:(InteractionDataRecord) AND _query_:("{!type=parent which=x_type_s:(InteractionDataRecord)}(path_s:(Events) AND ((@odata.type_s:(%23Sitecore.EmailCampaign.Model.XConnect.Events.UnsubscribedFromEmailEvent) OR (@odata.type_s:(%23Sitecore.EmailCampaign.Model.XConnect.Events.SpamComplaintEvent) OR (@odata.type_s:(%23Sitecore.EmailCampaign.Model.XConnect.Events.EmailSentEvent) OR (@odata.type_s:(%23Sitecore.EmailCampaign.Model.XConnect.Events.EmailOpenedEvent) OR (@odata.type_s:(%23Sitecore.EmailCampaign.Model.XConnect.Events.EmailClickedEvent) OR (@odata.type_s:(%23Sitecore.EmailCampaign.Model.XConnect.Events.DispatchFailedEvent) OR (@odata.type_s:(%23Sitecore.EmailCampaign.Model.XConnect.Events.BounceEvent) OR @odata.type_s:(%23Sitecore.EmailCampaign.Model.XConnect.Events.EmailEvent)))))))) AND (definitionid_s:(2a65acc5985140dd851b23f7a6c53092) AND messageid_s:(c7b8358d3fea4138b2b15d457bf4d2b6))))"))"))
  4. The query which I used as an example has a _query_ element, whose value must be in quotes. The nested query also has its own inner _query_ element, whose value must be in quotes as well. As a result, the inner quotes must be properly encoded, otherwise the query will be parsed incorrectly. I used %5C%22 to replace the inner quotes, which corresponds to the \" characters:

    (x_type_s:(ContactDataRecord) AND _query_:("{!type=join from=contactid_s to=id}(x_type_s:(InteractionDataRecord) AND _query_:( %5C%22{!type=parent which=x_type_s:(InteractionDataRecord)}(path_s:(Events) AND ((@odata.type_s:(%23Sitecore.EmailCampaign.Model.XConnect.Events.UnsubscribedFromEmailEvent) OR (@odata.type_s:(%23Sitecore.EmailCampaign.Model.XConnect.Events.SpamComplaintEvent) OR (@odata.type_s:(%23Sitecore.EmailCampaign.Model.XConnect.Events.EmailSentEvent) OR (@odata.type_s:(%23Sitecore.EmailCampaign.Model.XConnect.Events.EmailOpenedEvent) OR (@odata.type_s:(%23Sitecore.EmailCampaign.Model.XConnect.Events.EmailClickedEvent) OR (@odata.type_s:(%23Sitecore.EmailCampaign.Model.XConnect.Events.DispatchFailedEvent) OR (@odata.type_s:(%23Sitecore.EmailCampaign.Model.XConnect.Events.BounceEvent) OR @odata.type_s:(%23Sitecore.EmailCampaign.Model.XConnect.Events.EmailEvent)))))))) AND (definitionid_s:(2a65acc5985140dd851b23f7a6c53092) AND messageid_s:(c7b8358d3fea4138b2b15d457bf4d2b6)))) %5C%22))"))
  5. The query above can be executed against SOLR. It can be done by requesting the following URL in a browser:

    https://localhost:8984/solr/sc901_xdb/select?indent=on&q=TheQueryGoesHere

    Here, sc901_xdb is the name of the core, and the query is passed in the q parameter.
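The manual steps above can also be scripted. The Python sketch below is one possible way to do it, not a general SOLR tool: the function names are made up, the log line is a heavily shortened stand-in for the real one, and the code assumes that inner quotes always appear as a backslash followed by a quote after URL decoding, which holds for the two nesting levels in the example.

```python
import re
from urllib.parse import unquote_plus

# Temporary marker for inner quotes; assumed never to occur in a query.
INNER_QUOTE = "\x00"


def extract_q(log_line):
    """Take everything between 'q=' and the next '&'."""
    match = re.search(r"\bq=([^&]*)", log_line)
    if match is None:
        raise ValueError("no q parameter found in the log line")
    return match.group(1)


def readable_query(q):
    """URL-decode the value and drop the escaping backslashes."""
    return unquote_plus(q).replace("\\", "")


def replayable_query(q):
    """Same as readable_query, but with '#' and inner quotes encoded."""
    decoded = unquote_plus(q)
    # Remember the inner quotes before the backslashes are removed.
    marked = decoded.replace('\\"', INNER_QUOTE)
    stripped = marked.replace("\\", "")
    return stripped.replace("#", "%23").replace(INNER_QUOTE, "%5C%22")


# Shortened, made-up log line with the same escaping pattern as above.
log_line = r'GET {q=(a:(X)+AND+_query_:("\{\!type%3Djoin\}\(b\:\(Y\)+AND+_query_\:\(\"\\\{\\\!type%3Dparent\\\}\\\(t\\\:\\\(#Foo\\\)\\\)\"\)\)"))&df=_text_&rows=0}'

q = extract_q(log_line)
print(readable_query(q))
print(replayable_query(q))
```

The first printed line is the readable form for analysis, and the second one is the form to paste into the q parameter of the select URL.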

How Experience Analytics Reduce functionality works

During five years of working with Experience Analytics, I never had a chance to dive deep into the segment reducing functionality.
Several days ago, the time finally came and I had to find out how it works. I decided to describe it in detail, as somebody may find it interesting.

I used version 8.2.7 for the demonstration, though the functionality should be pretty much the same in previous versions and revisions of 8.x.

Demo Data

I created a small example to demonstrate how the reduce manager works.
There is a campaign group named Campaign Group 1, which has 3 campaigns with the following names:
Campaign1
Campaign2
Campaign3

The report for the Campaign Group 1 group looks as follows:

Experience Analytics uses 5 tables to retrieve the data in Sitecore 8.2: SegmentRecords, Fact_SegmentMetrics, SegmentRecordsReduced, Fact_SegmentMetricsReduced, DimensionKeys.
Here is an approximate query which helps to understand how exactly the records are stored (assuming that the reducer has not been executed yet):

SELECT Date, Visits, Value, DimensionKey
FROM [SegmentRecords] JOIN [Fact_SegmentMetrics]
ON [SegmentRecords].SegmentRecordId = [Fact_SegmentMetrics].SegmentRecordId
JOIN [DimensionKeys] ON [SegmentRecords].DimensionKeyId = [DimensionKeys].DimensionKeyId
WHERE SegmentId = '7A9A483F-195D-4F96-AD88-473CD6854C4F'

The SegmentRecordsReduced and Fact_SegmentMetricsReduced tables are not used for now, since they are populated only when reducing is performed.
The query gives the following results:

The first id in the DimensionKey is the campaign group id, and the second one is the id of the campaign. Thus, we have the same results as in the report:
156e78f3-f4ea-43d1-8607-07c38ccc53a9 - 13 times
4ba98edb-b73c-4951-a5bc-6f4210651a3a - 2 times
f4701287-86e2-4534-8358-5701cdb5c7ee - 1 time.

Reduce functionality

The reducer compresses the data which the system considers insignificant. This leaves fewer records in the database, which keeps query execution fast even for big databases.

All reducing related configuration can be found in the following file:

App_Config\Include\ExperienceAnalytics\Sitecore.ExperienceAnalytics.Reduce.config

There is a reduceLoader hook, which causes ReduceAgent to be executed every 30 seconds.

<agent type="Sitecore.ExperienceAnalytics.Reduce.ReduceAgent, Sitecore.ExperienceAnalytics.Reduce" >
  <param desc="connectionStringName">reporting</param>
  <param desc="triggerHour">1</param>
  <param desc="logger" ref="experienceAnalytics/reduce/logger"/>
  <param ref="experienceAnalytics/reduce/manager"/>
</agent>

An interesting parameter here is triggerHour, which defines the only hour at which ReduceAgent is allowed to do its work (1 AM by default). If the agent wakes up at any other hour, it does nothing.
Apart from that, the reduce agent can do its work only once in 24 hours. The time of the last execution is stored in the Properties table of the reporting database:

SELECT * FROM [Properties]
WHERE [Key] = 'EA_reduce_lastrun'
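The two conditions can be summarized in a few lines of code. This Python sketch is only my reading of the behavior described above, not the actual ReduceAgent code: the function name should_reduce is made up, and the "once in 24 hours" rule is assumed to mean that at least 24 hours have passed since the stored last run.

```python
from datetime import datetime, timedelta


def should_reduce(now, last_run, trigger_hour=1):
    """Decide whether the agent does any work on this wake-up.

    now          - current time of the wake-up
    last_run     - time of the last execution, or None if it never ran
    trigger_hour - the triggerHour parameter (1 by default)
    """
    if now.hour != trigger_hour:
        return False  # woke up at the wrong hour: do nothing
    if last_run is None:
        return True
    # Assumption: "once in 24 hours" is measured from the last run.
    return now - last_run >= timedelta(hours=24)


wake_up = datetime(2019, 8, 23, 1, 0, 30)
print(should_reduce(wake_up, datetime(2019, 8, 22, 1, 0, 0)))
print(should_reduce(datetime(2019, 8, 23, 2, 0, 0), None))
```

Since the agent wakes up every 30 seconds, it gets many chances during the trigger hour, but only the first wake-up that passes both checks leads to a reduce run.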

The agent delegates the execution to ReduceManager:

<manager type="Sitecore.ExperienceAnalytics.Reduce.ReduceManager, Sitecore.ExperienceAnalytics.Reduce">
  <param desc="connectionStringName">reporting</param>
  <param desc="retentionDays">7</param>
  <param desc="logger" ref="experienceAnalytics/reduce/logger"/>
</manager>

The retentionDays parameter defines how many days the aggregated data remains untouched by the reduce manager.

ReduceManager has a limited time in which it is allowed to run. By default, it is 1 hour, and it can be configured using the following setting:

<setting name="ExperienceAnalytics.Reduce.Timeout" value="01:00:00" />

If the execution time is exceeded, the operation is aborted.

The reduce manager is executed for each site from the SiteNames table of the reporting database and for each segment. There are several checks in the code which may prevent a segment from being reduced. Mostly, this happens if re-aggregation is in progress or if there is no data for the particular segment which is older than the retention period (7 days by default). The corresponding messages are written to the log file with the Info level.

The main reducing logic is performed by the ReduceSegmentMetrics stored procedure. First of all, the data for further reducing is retrieved. It is done using a query like the following:

SELECT ROW_NUMBER() OVER(ORDER BY sm.Visits DESC, ABS(sm.Value) DESC) AS 'PredicateOrder',
sr.[SegmentId],
sr.[Date],
sr.[SiteNameId],
sr.[DimensionKeyId],
sm.[SegmentRecordId],
sm.[ContactTransitionType],
sm.[Visits],
sm.[Value],
sm.[Bounces],
sm.[Conversions],
sm.[TimeOnSite],
sm.[Pageviews],
sm.[Count]
FROM SegmentRecords sr
INNER JOIN Fact_SegmentMetrics sm ON sr.SegmentRecordId = sm.SegmentRecordId
INNER JOIN DimensionKeys dk ON dk.DimensionKeyId = sr.DimensionKeyId
WHERE sr.SegmentId = '7A9A483F-195D-4F96-AD88-473CD6854C4F' AND sr.[Date] >= '2019-08-16 00:00:00' AND
sr.[Date] < '2019-08-17 00:00:00'

The resulting table in our case will be:

The records in the table are ordered by the number of visits and the engagement value, and each row has a PredicateOrder number assigned. If the number of records is too big, only the first n records remain and the others are reduced. PredicateOrder determines which records to keep.
By default, only the first 1000 records are considered significant and the rest are reduced. It can be controlled using the following setting:

<setting name="ExperienceAnalytics.Reduce.DefaultKeepCountThreshold" value="1000" />

In our case, this setting does not affect the behavior since there are only 4 records for the segment.
The resulting set is filtered further using the Visits and Value metrics. By default, only records with Visits > 10 are considered significant. As for the value, by default ABS(Value) must be > -1 (which is always true) to be significant.
The corresponding default values can be changed in the configuration:

<setting name="ExperienceAnalytics.Reduce.DefaultValueThreshold" value="-1" />
<setting name="ExperienceAnalytics.Reduce.DefaultVisitThreshold" value="10" />

The change will affect all segments.

Each particular segment can also be configured independently. To do that, the dimension which corresponds to the segment must be found first. It can be done as follows:

  1. Find the needed segment in the master database using Content Editor.
  2. The segment parent item is the needed dimension.
  3. Use the dimension id to find it in the Sitecore.ExperienceAnalytics.Reduce.config file. Using the visitThreshold and valueThreshold attributes it is possible to define the required thresholds.
<dimension id="{3E01BA28-2B4D-408A-A4BA-6C51ED9FFB9C}" type="Sitecore.ExperienceAnalytics.Aggregation.Dimensions.ByCampaign, Sitecore.ExperienceAnalytics.Aggregation" visitThreshold="1" valueThreshold="-1"/>

In the configuration above, records which have more than 1 visit and any value are considered significant.
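To make the significance rules easier to follow, here is a small Python model of them. It is a sketch of the logic as I understand it, not a port of the ReduceSegmentMetrics stored procedure: the rows are invented, only Visits and Value are tracked, the insignificant rows are rolled up into a single [Other] row, and the visit and value thresholds are assumed to be combined with AND.

```python
from dataclasses import dataclass


@dataclass
class Row:
    dimension_key: str
    visits: int
    value: int


def reduce_rows(rows, keep_count=1000, visit_threshold=10, value_threshold=-1):
    """Split rows into significant ones and a single summed-up [Other] row."""
    # Same ordering as PredicateOrder: Visits DESC, ABS(Value) DESC.
    ordered = sorted(rows, key=lambda r: (r.visits, abs(r.value)), reverse=True)
    kept = []
    other_visits = 0
    other_value = 0
    for order, row in enumerate(ordered, start=1):
        significant = (
            order <= keep_count
            and row.visits > visit_threshold
            and abs(row.value) > value_threshold
        )
        if significant:
            kept.append(row)
        else:
            other_visits += row.visits
            other_value += row.value
    reduced_any = len(kept) < len(ordered)
    other = Row("[Other]", other_visits, other_value) if reduced_any else None
    return kept, other


# Invented rows, for illustration only.
rows = [Row("campaign-a", 13, 0), Row("campaign-b", 2, 0), Row("campaign-c", 1, 0)]

print(reduce_rows(rows))                     # default thresholds
print(reduce_rows(rows, visit_threshold=1))  # visitThreshold="1"
```

With the default thresholds only the row with 13 visits survives, while lowering the visit threshold to 1 also keeps the row with 2 visits.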


In my case (with the default configuration), only the first record is significant. The rest of the records will be reduced. Value, Visits, Bounces, etc. will be summed up, and only one reduced record with the [Other] DimensionKey will be stored in the Fact_SegmentMetricsReduced and SegmentRecordsReduced tables (note that initially the data was stored in the Fact_SegmentMetrics and SegmentRecords tables). The reduced data will be purged from the Fact_SegmentMetrics and SegmentRecords tables after the stored procedure has finished its work.
Running the query against the Fact_SegmentMetrics and SegmentRecords tables will return no results now. If the same query is executed against the Fact_SegmentMetricsReduced and SegmentRecordsReduced tables, the result will be the following:

SELECT Date, Visits, Value, DimensionKey
FROM [SegmentRecordsReduced] JOIN [Fact_SegmentMetricsReduced]
ON [SegmentRecordsReduced].SegmentRecordId = [Fact_SegmentMetricsReduced].SegmentRecordId
JOIN [DimensionKeys] ON [SegmentRecordsReduced].DimensionKeyId = [DimensionKeys].DimensionKeyId
WHERE SegmentId = '7A9A483F-195D-4F96-AD88-473CD6854C4F'

As can be seen, non-significant data is stored with the [Other] dimension key and thus can’t be used for displaying particular campaigns in the reports. The report shows the following data after the reducer has finished its work:

Summary

Considering the above, we now know that Experience Analytics reports may show less data after some time. This behavior can be adjusted using the configuration settings to meet the requirements:

  1. The retentionDays value can be increased. In this case, the data will remain unchanged for a longer time.
  2. The reduce functionality can be disabled completely by disabling the Sitecore.ExperienceAnalytics.Reduce.config file. However, please note that this approach may cause the reporting database to grow, and as a result the queries will be executed slowly if the amount of data is too big.
  3. Specific segments can be reconfigured using the visitThreshold and valueThreshold attributes, as described above.
  4. It is also possible to comment out a specific dimension under the experienceAnalytics/reduce/dimensions node; however, EA does not treat this as a valid configuration. If a dimension is present under the experienceAnalytics/aggregation/dimensions node, it should be present under the experienceAnalytics/reduce/dimensions node as well. Otherwise, the following error will appear in the log:
ERROR [Experience Analytics]: Error trying to reduce segment: ea364d4d-b85c-4c07-88d1-51edcaa1a160, date: 8/16/2019, site: website

However, the error is not critical and does not seem to break anything.

How to address "Definition not found" issue during aggregation

Issue description

The following exception is quite common and can be found in the log files when the processing speed is low:

165164 14:04:56 WARN  Failed to process interaction with Id 2e58c56c-d6c7-452b-a1d1-2b068c15372c and ContactId {F9BE0608-E18C-4A35-B2C2-EA3C5226C107}. Processing will be retried later.
Exception: System.InvalidOperationException
Message: Definition not found: itemId: '{38AAA41B-E509-4083-B876-799B809F13BA}' culture: 'Invariant Language (Invariant Country)' type: 'Sitecore.Marketing.Definitions.Goals.IGoalDefinition'
Source: Sitecore.ExperienceAnalytics.Aggregation
at Sitecore.ExperienceAnalytics.Aggregation.Pipeline.SegmentProcessor.ProcessSegments(AggregationPipelineArgs args, IEnumerable`1 segments)
at Sitecore.ExperienceAnalytics.Aggregation.Pipeline.SegmentProcessor.OnProcess(AggregationPipelineArgs args)
at Sitecore.Analytics.Aggregation.Pipeline.AggregationProcessor.Process(AggregationPipelineArgs args)
at (Object , Object )
at Sitecore.Pipelines.CorePipeline.Run(PipelineArgs args)
at Sitecore.Pipelines.DefaultCorePipelineManager.Run(String pipelineName, PipelineArgs args, String pipelineDomain, Boolean failIfNotExists)
at Sitecore.Pipelines.DefaultCorePipelineManager.Run(String pipelineName, PipelineArgs args, String pipelineDomain)
at Sitecore.Analytics.Aggregation.Pipeline.AggregationPipeline.Run(AggregationPipelineArgs args)
at Sitecore.Analytics.Aggregation.InteractionBatchAggregator.Aggregate(IVisitAggregationContext interaction, InteractionAggregationType interactionAggregationType, RebuildTargets targets)
at Sitecore.Analytics.Aggregation.InteractionBatchAggregator.Aggregate(ItemBatch`1 batch, InteractionAggregationType interactionAggregationType, RebuildTargets targets)

Nested Exception

Exception: System.InvalidOperationException
Message: Definition not found: itemId: '{38AAA41B-E509-4083-B876-799B809F13BA}' culture: 'Invariant Language (Invariant Country)' type: 'Sitecore.Marketing.Definitions.Goals.IGoalDefinition'
Source: Sitecore.ExperienceAnalytics.Core
...

As can be seen from the stack trace and the message, the exception is thrown during aggregation when the definition of a converted goal can’t be found.
In this case, the processing pool will contain records with attempts > 0.

As a result, the aggregation will try to process each record with a missing goal 10 times. Even with a small amount of data, this slows down the aggregation process significantly.
There are two main sources of the marketing definition items:

  1. The master database is the main marketing definitions storage (the source of truth). Definitions are retrieved from the master database when a standalone instance is used.
  2. The reporting database is the definitions storage which is used for aggregation if a dedicated processing server is in place.

To address the issue, the following actions can be performed:

  1. Try to find the item which corresponds to the missing item ID in the master database. If the item is present in the master database, the issue most likely happens because the definition is missing in the reporting database. Redeploying the definitions using the Control Panel should solve the issue.
  2. If the item is missing, everything becomes a bit more difficult. The situation is the following: the collection database contains data which was collected for a goal that has since been removed.
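When many interactions fail, it helps to collect the distinct missing ids from the log files first. The Python sketch below assumes the message format shown in the exception above; the function name missing_definitions is made up, and the sample text is a shortened copy of the log fragment.

```python
import re
from collections import Counter

# Matches: Definition not found: itemId: '{GUID}' culture: '...' type: '...'
DEFINITION_NOT_FOUND = re.compile(
    r"Definition not found: itemId: '\{(?P<id>[0-9A-Fa-f-]{36})\}'"
    r".*?type: '(?P<type>[^']+)'"
)


def missing_definitions(log_text):
    """Count occurrences of each (item id, definition type) pair."""
    return Counter(
        (match.group("id").upper(), match.group("type"))
        for match in DEFINITION_NOT_FOUND.finditer(log_text)
    )


log_text = """
Exception: System.InvalidOperationException
Message: Definition not found: itemId: '{38AAA41B-E509-4083-B876-799B809F13BA}' culture: 'Invariant Language (Invariant Country)' type: 'Sitecore.Marketing.Definitions.Goals.IGoalDefinition'
Source: Sitecore.ExperienceAnalytics.Aggregation
Nested Exception
Message: Definition not found: itemId: '{38AAA41B-E509-4083-B876-799B809F13BA}' culture: 'Invariant Language (Invariant Country)' type: 'Sitecore.Marketing.Definitions.Goals.IGoalDefinition'
"""

for (item_id, definition_type), count in missing_definitions(log_text).items():
    print(item_id, definition_type, count)
```

Each distinct id in the output is a candidate for the lookup in the master database described above.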

What to do if the definition is missing in the master database?

The collection database stores the id of the goal item, and during the aggregation the goal should be retrieved from the definitions storage to get additional info (like assets).
Thus, to get rid of the exceptions, it is possible to recreate the goals with the same ids as stored in the collection database. Here is the approach I used:

  1. Backup the solution.
  2. For each problematic id from the log files, like 38AAA41B-E509-4083-B876-799B809F13BA here:
    Definition not found: itemId: '{38AAA41B-E509-4083-B876-799B809F13BA}' culture: 'Invariant Language (Invariant Country)' type: 'Sitecore.Marketing.Definitions.Goals.IGoalDefinition'

    Ensure that the item does not exist and create a new goal item under the /sitecore/system/Marketing Control Panel/Goals item. Do not deploy it. You can name it any way you want.
  3. Copy the created goal id.
  4. Now, we need to replace this id in the master database with the one from the log files. It can be done using the following SQL code executed against the master database:
    declare @itemId uniqueidentifier = '{F451B31C-71D2-4433-B449-C9886E9A09EC}' -- Your created goal id goes here.
    declare @oldItemId uniqueidentifier = '{38AAA41B-E509-4083-B876-799B809F13BA}' -- id from the log files goes here.

    UPDATE [dbo].SharedFields
    SET ItemId = @oldItemId
    WHERE ItemId = @itemId

    UPDATE [dbo].VersionedFields
    SET ItemId = @oldItemId
    WHERE ItemId = @itemId

    UPDATE [dbo].Items
    SET ID = @oldItemId
    WHERE ID = @itemId

    Before running the update statements, it is better to ensure that no item with the old id exists in the database. The checks below use the @oldItemId variable, so run them in the same batch as its declaration:

    SELECT *
    FROM [dbo].Items WHERE ID = @oldItemId

    SELECT *
    FROM [dbo].SharedFields WHERE ItemId = @oldItemId

    SELECT *
    FROM [dbo].VersionedFields WHERE ItemId = @oldItemId


    Since Sitecore has a number of caching layers, I recommend stopping the Sitecore instance while performing the SQL operations so that the Content Editor shows up-to-date data afterwards.
  5. Start the Sitecore instance and check that the created goals now have the correct IDs (if the instance was running while the scripts were executed, you may still see the old IDs because they are served from the cache).
  6. Deploy the goals.
  7. Check whether the exceptions for the recreated goal ids have disappeared from the log files.
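
If the log files contain many different missing ids, collecting them by hand for step 2 gets tedious. Below is a minimal sketch of a helper that extracts the distinct ids from log lines. It assumes the message format quoted above, with straight quotes around the id; the MissingDefinitionIds class and its Extract method are hypothetical names, not part of Sitecore.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class MissingDefinitionIds
{
    // Assumption: the log message looks like
    // "Definition not found: itemId: '{GUID}' culture: ...".
    private static readonly Regex Pattern = new Regex(
        @"Definition not found: itemId: '\{(?<id>[0-9A-Fa-f-]{36})\}'",
        RegexOptions.Compiled);

    // Returns the distinct item ids mentioned in the given log lines.
    public static ISet<Guid> Extract(IEnumerable<string> logLines)
    {
        var ids = new HashSet<Guid>();
        foreach (var line in logLines)
        {
            foreach (Match match in Pattern.Matches(line))
            {
                // Skip anything that has the right length but is not a valid GUID.
                if (Guid.TryParse(match.Groups["id"].Value, out var id))
                {
                    ids.Add(id);
                }
            }
        }
        return ids;
    }
}
```

The helper can be fed with System.IO.File.ReadLines for each log file; every returned id is then a candidate for the recreation steps above.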


The approach was checked on Sitecore 8.2.7 and should work for 9.x in the same way. The same behavior may be experienced for other marketing definitions (like campaigns). The approach is the same; the only difference is the type of definition item which needs to be created in step 2.


P.S. When completing this article, I thought about using ItemManager to create the item with the corresponding ID. I did not check it, but I believe it should work as well.


P.P.S Do not remove the marketing definitions once created!

How to inject logging to Marketing Automation condition

The best way to understand why something does not work is to debug it. Unfortunately, it is not always possible to attach Visual Studio or WinDbg to a running process. In such cases, I usually extend the code with logging and check the output.


Some time ago I had issues with a non-working Marketing Automation condition. It was a custom condition and I had the code for it. After some time monitoring the MA pool without success, I decided to inject logging into the condition's Evaluate method to troubleshoot it. The conditions are processed by the maengine service, so the usual Sitecore Log.Info() call won’t work here since there is no Sitecore.Kernel assembly reference. Using DI and passing the ILogger via the condition constructor won’t work either, since the conditions are created using Activator and require a parameterless constructor.


MA has quite elegant extension points, which allow us to get the ILogger from within a condition. The Evaluate method gets an IRuleExecutionContext as a parameter. The interface has only one method, Fact, which is used for getting the contact, the interaction and other required information. We will use it to get the needed ILogger implementation. I will use a test FirstNameCondition below to demonstrate the behavior.


using Microsoft.Extensions.Logging;
using Sitecore.Framework.Rules;
using Sitecore.XConnect;
using Sitecore.XConnect.Collection.Model;

namespace TestCondition
{
    public class FirstNameCondition : ICondition
    {
        public bool Evaluate(IRuleExecutionContext context)
        {
            // ILogger here must be Microsoft.Extensions.Logging.ILogger, the same type
            // that the custom ConditionEvaluationService registers as a fact.
            var logger = context.Fact<ILogger>();
            logger.LogInformation("Condition was executed");

            var contact = context.Fact<Contact>();
            return contact.Personal()?.FirstName == "Sergey";
        }
    }
}

By default, the ILogger object is not present in context and we need to register it. It can be done by overriding the PopulateScopedFacts method of the ConditionEvaluationService.

using System;
using Microsoft.Extensions.Logging;
// The using directives for the Sitecore Marketing Automation types are omitted here.

public class ConditionEvaluationService : Sitecore.Xdb.MarketingAutomation.Rules.ConditionEvaluationService
{
    private readonly ILogger<Sitecore.Xdb.MarketingAutomation.Rules.ConditionEvaluationService> _logger;

    public ConditionEvaluationService(
        ILogger<Sitecore.Xdb.MarketingAutomation.Rules.ConditionEvaluationService> logger,
        IConditionSerializer serializer,
        IServiceProvider serviceProvider)
        : base(logger, serializer, serviceProvider)
    {
        _logger = logger;
    }

    public ConditionEvaluationService(
        ILogger<Sitecore.Xdb.MarketingAutomation.Rules.ConditionEvaluationService> logger,
        IConditionSerializer serializer,
        IConditionCache cache,
        IServiceProvider serviceProvider)
        : base(logger, serializer, cache, serviceProvider)
    {
        _logger = logger;
    }

    protected override void PopulateScopedFacts(
        FactSpecificier facts,
        IContactProcessingContext processingContext,
        IConditionServices conditionServices,
        ISegmentationServiceContext segmentationServiceContext)
    {
        base.PopulateScopedFacts(facts, processingContext, conditionServices, segmentationServiceContext);

        // Expose the logger to conditions via context.Fact<ILogger>().
        facts.Fact<ILogger>(_logger);
    }
}

The corresponding service must be replaced by a custom one in the following file:

App_data\jobs\continuous\AutomationEngine\App_Data\Config\sitecore\MarketingAutomation\sc.MarketingAutomation.ConditionEvaluationService.xml
<MarketingAutomation.Rules.ConditionEvaluationService>
    <Type>CustomNamespace.ConditionEvaluationService, CustomAssembly</Type>
    <As>Sitecore.Xdb.MarketingAutomation.Core.Rules.IConditionEvaluationService, Sitecore.Xdb.MarketingAutomation.Core</As>
    <LifeTime>Singleton</LifeTime>
</MarketingAutomation.Rules.ConditionEvaluationService>