Dipanshu Kumar
2026-05-19 11:58:33 +05:30
+105
@@ -0,0 +1,105 @@
Recommended relationship model
1. Use a dimensional/star schema
Create:
DimStore
DimEmployee
DimProduct / SKU
DimPromotionDefinition
DimVisibilityDefinition
DimVisibilityReason
DimDisplay
DimSalesTerritory / territory hierarchy
DimDate
Optionally DimChannel / DimChain / DimStoreType, if you need clean lookup values
2. Map the fact tables to those dimensions
Store dimension
Use Store_Master.store_id as the primary store key.
Fact tables referencing store:
Sales.StoreId
Promotion.store_id
Mapping_StorePromotion.StoreId
Mapping_StoreVisibility.StoreId
Journey_Plan.store_id
Contact Conversion.store_id
PaidVisibility.store_id
PaidVisibility_Compliance.store_id
Coverage.store_id
additional_visibility.store_id
Employee dimension
Use Employee_Master.employee_id.
Fact tables referencing employee:
Sales.EmpID
Promotion.employee_id
PaidVisibility.employee_id
PaidVisibility_Compliance.employee_id
OQaD.employee_id
Journey_Plan.employee_id
Attendance.employee_id
Login.employee_id
Contact Conversion.emp_id
additional_visibility.emp_id
Product/SKU dimension
Use SKU Master.product_id (or pk if you want a warehouse surrogate).
Sales.ProductId should join to SKU Master.product_id
Visibility dimension
Use Master_VisibilityDefinition.VisibilityDefinitionid.
Tables referencing the visibility definition:
Mapping_StoreVisibility.VisibilityDefinitionid
PaidVisibility.Visibility_definition_id
PaidVisibility_Compliance.visibility_definition_id
Reason dimension
Use Master_VisibilityReason.ReasonId.
Tables referencing the reason:
Promotion.ReasonId
PaidVisibility.ReasonId
coverage_remarks.reason_id
Promotion mapping
Likely relationship:
Mapping_StorePromotion.PromotionDefinitionid → Master_PromotionDefinition
Promotion.promo_definition_id → Master_PromotionDefinition
Display mapping
additional_visibility.display_id → display_master.display_id
Territory hierarchy
Master_SalesTerritory and Master_Salesterritorylayer are hierarchical masters.
Join them on matching StLayerOneId … StLayerFourId and project_id.
Use the result to enrich Store_Master with the territory hierarchy.
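The mappings above can be recorded as metadata so that ETL jobs and BI tooling read the same relationship list. The following is a minimal sketch, assuming the table and column names listed above. The `Relationship` type, the `RELATIONSHIPS` list, and the `join_clause` and `relationships_for` helpers are hypothetical names introduced for illustration, and only a subset of the mappings is shown.

```python
from typing import NamedTuple


class Relationship(NamedTuple):
    """One fact-to-master key mapping (hypothetical helper type)."""
    fact_table: str
    fact_column: str
    master_table: str
    master_column: str


# A subset of the mappings listed above; extend it with the remaining tables.
RELATIONSHIPS = [
    Relationship("Sales", "StoreId", "Store_Master", "store_id"),
    Relationship("Sales", "EmpID", "Employee_Master", "employee_id"),
    Relationship("Promotion", "store_id", "Store_Master", "store_id"),
    Relationship("Promotion", "employee_id", "Employee_Master", "employee_id"),
    Relationship("Promotion", "ReasonId", "Master_VisibilityReason", "ReasonId"),
    Relationship("PaidVisibility", "Visibility_definition_id",
                 "Master_VisibilityDefinition", "VisibilityDefinitionid"),
]


def join_clause(rel: Relationship) -> str:
    """Render the JOIN a query needs to follow one relationship."""
    return (
        f"LEFT JOIN {rel.master_table} "
        f"ON {rel.fact_table}.{rel.fact_column} = "
        f"{rel.master_table}.{rel.master_column}"
    )


def relationships_for(fact_table: str) -> list[Relationship]:
    """Return every recorded relationship for one fact table."""
    return [r for r in RELATIONSHIPS if r.fact_table == fact_table]
```

Keeping the relationships as data rather than prose means the same list can drive validation queries and the schema description handed to a Generative BI tool.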
Best practices for ClickHouse and Generative BI
Use a curated warehouse model, not the raw SQL Server layout
Keep raw source tables as landing tables.
Build cleaned dimension tables and cleaned facts in ClickHouse.
ClickHouse does not enforce foreign keys, so validate referential integrity in ETL and record the relationships as metadata.
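Because the database will not reject a fact row whose key has no matching dimension row, the check has to live in the ETL step. A minimal sketch of such a check, assuming rows are available as dictionaries; the `find_orphan_keys` helper and the sample rows are hypothetical and only illustrate the idea.

```python
def find_orphan_keys(fact_rows, key_column, dimension_keys):
    """Return the sorted fact key values that have no matching dimension row."""
    known = set(dimension_keys)
    orphans = {row[key_column] for row in fact_rows if row[key_column] not in known}
    return sorted(orphans)


# Illustrative rows only; real data would come from the landing tables.
store_ids = [101, 102, 103]
sales_rows = [
    {"store_id": 101, "value": 250.0},
    {"store_id": 104, "value": 90.0},  # no such store in the dimension
    {"store_id": 102, "value": 40.0},
]

orphans = find_orphan_keys(sales_rows, "store_id", store_ids)
if orphans:
    print(f"store_id values missing from DimStore: {orphans}")
```

A load job could fail, quarantine the rows, or only log the orphans; which of these is right depends on how the downstream reports treat unmatched keys.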
Standardize keys and naming
Normalize EmpID / employee_id / emp_id to a single warehouse key.
Normalize StoreId / store_id / Unique_Store_ID to a single store key.
Normalize ProductId / product_id to a single product key, and use one naming style for channel_id, chain_id, storetype_id.
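The renaming can be driven by a small lookup table applied to every row during ETL. This is a minimal sketch: the canonical names (`employee_id`, `store_id`, `product_id`) and the `normalize_row` helper are assumptions for illustration. `Unique_Store_ID` is deliberately left out, since whether it carries the same values as `store_id` needs to be confirmed against the source data first.

```python
# Source spellings mapped to one warehouse name (assumed canonical names).
CANONICAL_COLUMNS = {
    "EmpID": "employee_id",
    "emp_id": "employee_id",
    "employee_id": "employee_id",
    "StoreId": "store_id",
    "store_id": "store_id",
    "ProductId": "product_id",
    "product_id": "product_id",
}


def normalize_row(row: dict) -> dict:
    """Rename known key columns; leave every other column untouched."""
    normalized = {}
    for column, value in row.items():
        target = CANONICAL_COLUMNS.get(column, column)
        if target in normalized:
            # Two spellings of the same key in one row would hide a conflict.
            raise ValueError(f"two source columns map to {target!r}")
        normalized[target] = value
    return normalized
```

For example, `normalize_row({"EmpID": 7, "StoreId": 101, "Sale": 3})` returns a row keyed by `employee_id`, `store_id`, and the unchanged `Sale` column.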
Choose consistent types
Many facts use int while masters use bigint; choose one type in the warehouse and convert consistently.
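Prefer UInt32 or UInt64 in ClickHouse based on value range (UInt32 holds values up to 4,294,967,295). Use a signed type such as Int64 if negative IDs can occur.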
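The choice can be made mechanically from the observed minimum and maximum of each key column. A minimal sketch, assuming the source values come from SQL Server int or bigint columns; the `clickhouse_key_type` helper is a hypothetical name.

```python
UINT32_MAX = 2**32 - 1
UINT64_MAX = 2**64 - 1


def clickhouse_key_type(min_value: int, max_value: int) -> str:
    """Pick UInt32, UInt64, or Int64 for a key column from its value range."""
    if min_value < 0:
        # Unsigned types cannot hold negative IDs; fall back to a signed type.
        # Int64 covers the full range of a SQL Server bigint.
        return "Int64"
    if max_value <= UINT32_MAX:
        return "UInt32"
    if max_value <= UINT64_MAX:
        return "UInt64"
    raise ValueError("value does not fit in a 64-bit integer")
```

Whatever type is chosen for a key, the fact and dimension columns should use the same one so joins do not need casts.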
Partition and sort facts by date
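Partition each fact table on its event date (visit_date, login_date, or audit_date), typically at month granularity, for example toYYYYMM(visit_date), because day-level partitions tend to create too many small parts.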
ORDER BY should include the join keys used often in queries, for example:
ORDER BY (project_id, store_id, visit_date)
or ORDER BY (project_id, employee_id, visit_date)
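To keep partitioning and sort keys consistent across fact tables, the DDL can be generated from one template. The following is a sketch only: the `fact_table_ddl` helper, the column list, the UInt32 types, the presence of `project_id` in the fact table, and the monthly `toYYYYMM` partition expression are all assumptions chosen for illustration.

```python
def fact_table_ddl(table, columns, date_column, order_by):
    """Render a MergeTree CREATE TABLE statement for one fact table."""
    column_lines = ",\n".join(f"    {name} {ctype}" for name, ctype in columns)
    return (
        f"CREATE TABLE {table}\n"
        f"(\n{column_lines}\n)\n"
        f"ENGINE = MergeTree\n"
        f"PARTITION BY toYYYYMM({date_column})\n"
        f"ORDER BY ({', '.join(order_by)})"
    )


ddl = fact_table_ddl(
    table="FactPromotion",
    columns=[
        ("project_id", "UInt32"),
        ("store_id", "UInt32"),
        ("employee_id", "UInt32"),
        ("promo_definition_id", "UInt32"),
        ("visit_date", "Date"),
    ],
    date_column="visit_date",
    order_by=["project_id", "store_id", "visit_date"],
)
print(ddl)
```

The ORDER BY tuple should follow the most common filter and join pattern for that table, so a fact queried mostly per employee would put `employee_id` ahead of `store_id`.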
Keep dimension tables narrow
Dimension tables like DimStore, DimEmployee, DimProduct, DimVisibilityDefinition, DimDisplay should be small and stable.
Fact tables should contain the dimension keys plus measures.
For Generative BI
A clean, consistent schema is critical.
Use descriptive dimension columns: store name, region, employee name, product name, visibility name, reason text, etc.
Avoid raw code-only facts; enrich them with lookup labels in ETL or views.
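Enriching a fact with lookup labels amounts to a keyed lookup against the dimension. A minimal sketch of doing this in ETL, assuming rows are dictionaries; the `enrich_with_labels` helper, the `reason_text` output column, the "UNKNOWN" placeholder, and the sample rows are hypothetical and not taken from the source data.

```python
def enrich_with_labels(fact_rows, dimension, key_column, label_column, output_column):
    """Add a readable label to each fact row by looking up its key."""
    labels = {row[key_column]: row[label_column] for row in dimension}
    enriched = []
    for row in fact_rows:
        new_row = dict(row)  # copy so the input rows are not modified
        # Keep unmatched keys visible instead of silently dropping the row.
        new_row[output_column] = labels.get(row[key_column], "UNKNOWN")
        enriched.append(new_row)
    return enriched


# Made-up sample rows for illustration.
dim_reason = [
    {"ReasonId": 1, "Reason": "Stock not available"},
    {"ReasonId": 2, "Reason": "Store closed"},
]
promotion_rows = [
    {"store_id": 101, "ReasonId": 2},
    {"store_id": 102, "ReasonId": 9},
]

for row in enrich_with_labels(promotion_rows, dim_reason, "ReasonId", "Reason", "reason_text"):
    print(row)
```

The same result can be produced with a view that joins the fact to its dimension; doing it in ETL trades storage for simpler queries.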
Practical next step
I recommend this immediate design:
DimStore(store_id, project_id, store_name, region, state, city, channel, distributor, store_type, ...)
DimEmployee(employee_id, project_id, employee_name, manager_id, role, channel_id, ...)
DimProduct(product_id, category, brand, product_name, mrp, ... )
DimVisibilityDefinition(VisibilityDefinitionid, VisibilityDefinitionName)
DimVisibilityReason(ReasonId, Reason)
FactSales(store_id, employee_id, product_id, channel_id, visit_date, sale, value, ...)
FactPromotion(store_id, employee_id, promo_definition_id, visit_date, promotion_status, ...)
FactPaidVisibility(...), FactPaidVisibilityCompliance(...), FactCoverage(...), FactAttendance(...), FactLogin(...), FactJourneyPlan(...), FactContactConversion(...)