ElasticFlow
ElasticFlow

Transform your business with AI-powered workflow automation. One unified platform for all your business needs.

Follow us

Platform

  • Features
  • Benefits
  • Use cases
  • Workflow library

Use cases

  • Sales
  • Marketing
  • Finance and Legal
  • HR

Catalog

  • Departments
  • Roles
  • Tools
  • Metrics
  • Platforms

Growth

  • Referral program
  • Partners

Legal

  • Privacy Policy
  • Terms of Service
  • Cookie Policy
  • Acceptable Use
  • Security
  • SLA

© 2026 ElasticFlow. All rights reserved.


Ask business questions against a data lake with Athena-style safety and cost checks (Claude Skill)

A Claude Skill for Claude Code by AWS. Run /querying-data-lake in Claude. Updated Jun 18, 2026. Version main@7cd875e.

Compatible with: ChatGPT, Claude, Claude Code, Codex / Codex CLI, Cursor, Gemini

Helps teams choose the right data source, workgroup, and SQL pattern, then query large data lake tables with attention to cost, interpretation, and safe limits.

  • Turns a business question into a safe data lake query plan before scanning large tables.
  • Chooses an appropriate workgroup, tables, partitions, limits, and query pattern.
  • Explains data scanned, cost risk, and whether the result is reliable enough to use.
  • Separates read-only analysis from destructive or unsafe actions.

Today

The team asks a data engineer for a number and gets an expensive or unclear one-off query.

With /querying-data-lake

Run /querying-data-lake to make the question, source, SQL pattern, cost, and caveats explicit before using the answer.

  1. Clarify the metric
  2. Choose source and workgroup
  3. Limit scanned data
  4. Explain results and caveats

Who it's for

Data Engineer

Make ad hoc data lake questions safer, cheaper, and easier to review.

Analytics Engineer

Turn stakeholder questions into query plans with metric definitions and caveats.

What it does

Ad hoc metric answer

Answer a business question from lake data without creating a permanent dashboard first.

Cost-aware exploration

Explore large tables with partition and workgroup checks before scanning too much data.

Data source validation

Confirm whether data lake tables can support a metric or competitive benchmark.

How it works

  1. Clarify the business question, date range, metric, filters, and output needed.
  2. Identify the likely catalog, database, table, partition, and workgroup.
  3. Build a query plan that limits scanned data and avoids unsafe operations.
  4. Run or draft SQL with preview limits and validate that the result is reasonable.
  5. Explain the answer, caveats, and whether the query should become a repeatable report.

Input options

Business question

The decision, metric, segment, and date range the query should answer.

Example

What the user pastes
Business question: did activation improve after the new onboarding checklist?
Date range: May 1 to June 14.
Segments: SMB, mid-market, enterprise.
Activation definition: workspace has invited at least two teammates and completed its first project within 7 days.
Known tables:
- product_events partitioned by event_date
- accounts partitioned by created_month
- workspaces
Need: query plan, safe SQL outline, result table, and caveats before sharing with product and finance.
Useful result
Query plan
Use read-only Athena SQL. Limit product_events to event_date between May 1 and June 14. Join accounts only on workspace_id and created_month. Start with a 100-row preview before the aggregate query. Use the analytics workgroup if available because this is a business analysis query.
SQL outline
1. Find workspaces created in the date range.
2. Count invited_teammate and first_project_completed events within 7 days of workspace creation.
3. Mark a workspace as activated when both conditions are true.
4. Group by account segment and pre/post checklist launch date.
5. Return workspace count, activated count, activation rate, and confidence caveat.
Result format
| Segment | Period | Workspaces | Activated | Activation rate | Readout |
|---|---|---:|---:|---:|---|
| SMB | Before | 420 | 151 | 36% | Baseline |
| SMB | After | 390 | 171 | 44% | Improvement visible |
| Mid-market | Before | 180 | 76 | 42% | Baseline |
| Mid-market | After | 165 | 84 | 51% | Stronger movement |
| Enterprise | Before | 52 | 19 | 37% | Small sample |
| Enterprise | After | 48 | 20 | 42% | Treat cautiously |
Caveats
This does not prove causality. Check whether acquisition source changed, whether event instrumentation was stable, and whether the enterprise sample size is too small for a confident claim.
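
The rates in the result table can be rechecked from the raw counts. The sketch below is illustrative only: the rows are copied from the table above, while the function names and the `MIN_SAMPLE` threshold of 100 workspaces are assumptions introduced here, not part of the skill.

```python
# Rows copied from the example result table: (segment, period, workspaces, activated).
ROWS = [
    ("SMB", "before", 420, 151),
    ("SMB", "after", 390, 171),
    ("Mid-market", "before", 180, 76),
    ("Mid-market", "after", 165, 84),
    ("Enterprise", "before", 52, 19),
    ("Enterprise", "after", 48, 20),
]

# Hypothetical cutoff: below this many workspaces, flag the rate as low confidence.
MIN_SAMPLE = 100


def activation_rate_pct(workspaces, activated):
    """Activation rate as a whole-number percentage, or None for an empty group."""
    if workspaces == 0:
        return None
    return round(100 * activated / workspaces)


def summarize(rows, min_sample=MIN_SAMPLE):
    """Return {(segment, period): {"rate_pct": ..., "small_sample": ...}}."""
    summary = {}
    for segment, period, workspaces, activated in rows:
        summary[(segment, period)] = {
            "rate_pct": activation_rate_pct(workspaces, activated),
            "small_sample": workspaces < min_sample,
        }
    return summary


def lift_points(summary, segment):
    """Percentage-point change from the 'before' period to the 'after' period."""
    return summary[(segment, "after")]["rate_pct"] - summary[(segment, "before")]["rate_pct"]


if __name__ == "__main__":
    result = summarize(ROWS)
    for segment in ("SMB", "Mid-market", "Enterprise"):
        flagged = result[(segment, "after")]["small_sample"]
        print(segment, lift_points(result, segment), "small sample" if flagged else "")
```

With this threshold only the enterprise rows are flagged, which matches the "Small sample" and "Treat cautiously" readouts in the table.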

Metrics it improves

Query performance (Operations)
Reduces unnecessary full-table scans and inefficient exploratory SQL.
Warehouse cost (Operations)
Keeps data scanned and cost risk visible before queries run.
Metric confidence (Operations)
Makes definitions, sources, and caveats explicit.

Works with

Google Sheets (manual)

Share result tables and analysis caveats with stakeholders.

Slack (manual)

Coordinate data owner review and stakeholder readouts.

SQL (manual)

Draft or review SQL patterns for data lake analysis.

Want to use Query Data Lake?

Choose how to get started.

Run in Claude Code
Free. Open source.

Install and run this skill locally on your computer.

1
Install Claude Code

Open a terminal on your computer and paste this command:

2
Install the skill

This downloads the skill with all its files to your computer:

Add -g at the end to make it available in all your projects.

3
Run

Start Claude Code, then type the command:

Query Data Lake

Execute SQL queries on Amazon Athena across default and federated catalogs (Glue, S3 Tables, Redshift) with workgroup selection, statement classification, and error recovery.

Overview

Executes and manages Athena SQL queries across default and federated catalogs. Selects a workgroup, resolves target assets (delegating fuzzy references to ¤KEEP0¤), classifies statements for safety, and reports cost and data scanned. Use the AWS MCP server for sandboxed execution and audit logging; the same AWS CLI commands work directly when the MCP server is not available.

Constraints for parameter acquisition:

  • You MUST accept a single optional argument: SQL text, a named-query name, a workgroup name, a catalog name, or ¤KEEP0¤
  • You MUST accept the argument as direct text or a pointer to a file containing SQL
  • You MUST ask the user for the target AWS region if not already set
  • You MUST confirm the output S3 location before executing any non-trivial query
  • You MUST respect the user's decision to abort at any step

Common Tasks

1. Verify Dependencies

Check for required tools and AWS access before running queries.

Constraints:

  • You MUST verify AWS MCP server tools are available (¤KEEP0¤) and run queries through them when present; fall back to the AWS CLI only if the MCP server is unavailable
  • You MUST NOT fall back to shell or Bash for query execution; results must be captured via the MCP tool or ¤KEEP0¤ CLI so output location and cost are tracked
  • You MUST confirm credentials with ¤KEEP0¤ and inform the user about any missing tools

2. Resolve Workgroup

Check caller identity, list workgroups, and auto-select the best one (see workgroup-selection.md).

Constraints:

  • You MUST select a workgroup before submitting any query (prevents output-location errors)
  • You MUST present the selected workgroup and its output location to the user
  • You MUST NOT auto-escalate to a different workgroup on failure without user confirmation

3. Resolve the Target Asset

If the user refers to a table by name, by business concept ("our quarterly report", "the sales data"), by S3 path, or by catalog without specifying the table, delegate to ¤KEEP0¤ to return the concrete ¤KEEP1¤ (and catalog if non-default).

Constraints:

  • You MUST NOT attempt to resolve fuzzy asset references with ¤KEEP0¤ or by iterating ¤KEEP1¤; those miss federated catalogs and waste tokens
  • You SHOULD skip this step only when the user provides a fully-qualified reference (exact ¤KEEP0¤) or raw SQL they want executed as-is
  • You MUST state the resolved asset explicitly before building the query: "Found [table] in [catalog]. Using this for the query."
  • You SHOULD default to the default Glue catalog unless the user mentions "federated", "Redshift", "S3 Tables", or ¤KEEP0¤ returns a different catalog

4. Discover Schema

For analytical queries, you SHOULD profile the target table before building the final query. You MUST show sample rows (¤KEEP0¤) as part of profiling.

5. Build Query

Table addressing depends on catalog type:

  • Default Glue catalog: ¤KEEP0¤ (omit the catalog prefix for single-catalog queries). In cross-catalog queries, qualify default-catalog tables with ¤KEEP1¤.
  • Registered data source: ¤KEEP0¤
  • Unregistered Glue catalog: ¤KEEP0¤

6. Classify and Execute

Classify the SQL statement before executing:

| Statement | Behavior |
|---|---|
| ¤KEEP0¤, ¤KEEP1¤, ¤KEEP2¤, ¤KEEP3¤ | Safe: execute |
| ¤KEEP0¤, ¤KEEP1¤, ¤KEEP2¤, ¤KEEP3¤, ¤KEEP4¤, ¤KEEP5¤, ¤KEEP6¤, ¤KEEP7¤ | Destructive: warn the user and require explicit confirmation |
| Unsure | Treat as destructive; confirm |
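
As a rough illustration of the classify-before-execute rule, the sketch below sorts a statement into "safe" or "confirm". The keyword lists in the table above are not reproduced in this copy of the document, so `SAFE_KEYWORDS` and `DESTRUCTIVE_KEYWORDS` are assumptions made for this example, and `classify_statement` is a hypothetical helper rather than the skill's own code.

```python
import re

# ASSUMPTION: stand-ins for the keyword lists in the classification table.
SAFE_KEYWORDS = {"SELECT", "SHOW", "DESCRIBE", "EXPLAIN"}
DESTRUCTIVE_KEYWORDS = {
    "INSERT", "UPDATE", "DELETE", "DROP", "CREATE", "ALTER", "MERGE", "TRUNCATE",
}


def _strip_comments(sql):
    """Remove /* ... */ and -- comments so they cannot hide the leading keyword."""
    sql = re.sub(r"/\*.*?\*/", " ", sql, flags=re.S)
    return re.sub(r"--[^\n]*", " ", sql)


def classify_statement(sql):
    """Return "safe" or "confirm". Anything uncertain is treated as destructive."""
    text = _strip_comments(sql).strip()
    body = text.rstrip(";").strip()
    if not body or ";" in body:
        # Empty input or more than one statement: unsure, so ask first.
        return "confirm"
    # Deliberately conservative: a destructive word anywhere forces confirmation,
    # even when it only appears inside a string literal.
    words = set(re.findall(r"[A-Za-z_]+", body.upper()))
    if words & DESTRUCTIVE_KEYWORDS:
        return "confirm"
    first_word = body.split(None, 1)[0].upper()
    if first_word in SAFE_KEYWORDS or first_word == "WITH":
        return "safe"
    return "confirm"


if __name__ == "__main__":
    print(classify_statement("SELECT * FROM orders LIMIT 5;"))
    print(classify_statement("DROP TABLE orders"))
```

The check errs toward asking: a false "confirm" costs one extra prompt, while a false "safe" could run a destructive statement unreviewed.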

Example tool call (via AWS MCP server):

aws___call_aws(command="aws athena start-query-execution --work-group <WORKGROUP_NAME> --query-string '<sql>' --query-execution-context Database=<db>")

For federated or S3 Tables catalogs, also set ¤KEEP0¤ in the execution context (e.g. ¤KEEP1¤).
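
For readers calling Athena from Python rather than the CLI, the same request can be expressed as keyword arguments for boto3's `start_query_execution`. This is a sketch under assumptions: `build_start_query_params` is a hypothetical helper, and because the exact catalog setting is elided in the text above, the `Catalog` key used here comes from the Athena API's `QueryExecutionContext` rather than from this document.

```python
def build_start_query_params(sql, workgroup, database, catalog=None, output_location=None):
    """Build keyword arguments for an Athena StartQueryExecution request.

    Pure function: it performs no AWS calls, so it can be inspected or logged
    before anything is submitted.
    """
    if not workgroup:
        # Mirrors the rule that a workgroup is selected before any query is submitted.
        raise ValueError("select a workgroup before submitting a query")
    context = {"Database": database}
    if catalog:
        # ASSUMPTION: non-default catalogs are passed via the Catalog key.
        context["Catalog"] = catalog
    params = {
        "QueryString": sql,
        "WorkGroup": workgroup,
        "QueryExecutionContext": context,
    }
    if output_location:
        params["ResultConfiguration"] = {"OutputLocation": output_location}
    return params


if __name__ == "__main__":
    params = build_start_query_params(
        "SELECT * FROM orders LIMIT 5",
        workgroup="analytics",          # hypothetical workgroup name
        database="sales_db",            # hypothetical database name
    )
    print(params)
    # With credentials configured, the request could then be sent with:
    #   boto3.client("athena").start_query_execution(**params)
```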

Constraints:

  • You MUST warn the user before executing when the target is Redshift-federated ("No partition pruning: every query scans the full table")
  • You MUST warn the user before executing a cross-catalog join ("Cross-catalog joins incur network overhead and may be slow")
  • You MUST confirm the output S3 location before executing
  • You MUST explain which tool is being called before executing
  • You MUST respect the user's decision to abort

7. Present and Recover

Present results with cost, data scanned, duration, and actionable insights. On failure, list available workgroups and let the user choose which one to retry with.
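
One way to present cost alongside results is to derive an estimate from the query statistics. The sketch below is a hedged illustration: the 5 USD per TB rate, the decimal terabyte, and the 10 MB per-query minimum are assumptions that should be checked against current Athena pricing for your region, and both function names are hypothetical. The `DataScannedInBytes` and `TotalExecutionTimeInMillis` keys are the ones returned in the `Statistics` section of Athena's GetQueryExecution response.

```python
# ASSUMPTIONS: verify against current Athena pricing before relying on these.
PRICE_PER_TB_USD = 5.00
BYTES_PER_TB = 10**12
MIN_BILLED_BYTES = 10 * 10**6


def estimate_cost_usd(bytes_scanned, price_per_tb=PRICE_PER_TB_USD):
    """Estimate query cost from bytes scanned, applying the assumed minimum."""
    if bytes_scanned <= 0:
        return 0.0
    billed = max(bytes_scanned, MIN_BILLED_BYTES)
    return billed / BYTES_PER_TB * price_per_tb


def format_summary(stats):
    """Format a one-line readout from a GetQueryExecution 'Statistics' dict."""
    scanned = stats.get("DataScannedInBytes", 0)
    millis = stats.get("TotalExecutionTimeInMillis", 0)
    return (
        f"Scanned {scanned / 10**9:.2f} GB in {millis / 1000:.1f} s, "
        f"estimated cost ${estimate_cost_usd(scanned):.4f}"
    )


if __name__ == "__main__":
    print(format_summary({"DataScannedInBytes": 1_500_000_000,
                          "TotalExecutionTimeInMillis": 4200}))
```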

Argument Routing

Resolve in this order; stop at the first match:

  1. Contains SQL keywords (¤KEEP0¤, ¤KEEP1¤, ¤KEEP2¤, ¤KEEP3¤, etc.): SQL text, execute directly
  2. ¤KEEP0¤: run comprehensive table profiling (see query-patterns.md)
  3. Matches a known named query: look up and execute
  4. Matches a known workgroup: show workgroup status and recent queries
  5. Matches a known catalog: delegate to ¤KEEP0¤ to enumerate databases and tables
  6. No args: show recent query activity and available tables
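
The routing order above could be sketched as a single function. Everything named here is illustrative: `route_argument` and the returned labels are hypothetical, the `SQL_KEYWORDS` tuple is an assumption because the document's keyword list is elided, and the `profile TABLE_NAME` form is taken from the skill's argument hint. The sketch checks only the first word for SQL keywords, which is narrower than "contains", and it adds an "ask_user" fallback for arguments that match nothing.

```python
# ASSUMPTION: stand-in for the elided SQL keyword list.
SQL_KEYWORDS = ("SELECT", "WITH", "SHOW", "DESCRIBE")


def route_argument(arg, named_queries=(), workgroups=(), catalogs=()):
    """Return a label for the first matching route, checked in priority order."""
    text = (arg or "").strip()
    if not text:
        # Route 6. Checked first only because an empty argument matches nothing else.
        return "show_recent_activity"
    parts = text.split()
    first_word = parts[0].upper()
    if first_word in SQL_KEYWORDS:
        return "execute_sql"                # route 1
    if first_word == "PROFILE" and len(parts) == 2:
        return "profile_table"              # route 2
    if text in named_queries:
        return "run_named_query"            # route 3
    if text in workgroups:
        return "show_workgroup_status"      # route 4
    if text in catalogs:
        return "delegate_catalog_listing"   # route 5
    return "ask_user"                       # not in the source list; added fallback


if __name__ == "__main__":
    print(route_argument("SELECT * FROM orders LIMIT 5"))
    print(route_argument("profile orders"))
    print(route_argument(None))
```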

Principles

  • Always select a workgroup before executing (prevents output-location errors)
  • Profile unfamiliar tables before running analytical queries
  • Present cost alongside results so users build cost awareness
  • Suggest ¤KEEP0¤ for exploratory queries on large tables
  • Never ask domain questions with obvious answers, but always confirm security-relevant actions (workgroup switches, output location changes, non-SELECT statements)

Troubleshooting

| Error | Cause | Fix |
|---|---|---|
| Redshift identifier error with mixed case | Redshift-federated names are lowercase only | Lowercase the identifier |
| ¤KEEP0¤ validation failure | ARN passed instead of catalog name | Pass the catalog name, not the ARN |
| Cross-catalog ¤KEEP0¤ returns nothing | Missing catalog qualifier | Use catalog-qualified path: ¤KEEP1¤ |
| Query fails with output-location error | Workgroup has no output location configured | Select a different workgroup with an output location, or configure one |
| Destructive statement executed without confirmation | Statement classification skipped | Always classify ¤KEEP0¤/¤KEEP1¤/¤KEEP2¤/¤KEEP3¤/¤KEEP4¤/¤KEEP5¤/¤KEEP6¤/¤KEEP7¤ and confirm with the user |

Additional Resources

  • Workgroup selection logic
  • Common query patterns
  • Athena best practices
  • Athena federated query

Reference documents


name: querying-data-lake
description: >-
  Execute and manage Athena SQL queries across default and federated catalogs (Glue, S3 Tables, Redshift). Triggers on phrases like: query data, run SQL, athena query, analyze table, SQL query, workgroup status, profile table, query Redshift catalog, query S3 Tables. Do NOT use for finding specific data assets (use finding-data-lake-assets), full catalog audits (use exploring-data-catalog), importing data (use ingesting-into-data-lake).
version: 1
argument-hint: '[SQL-query|query-name|workgroup-name|catalog-name|''profile TABLE_NAME'']'


Common Query Patterns (Presto/Athena SQL)

Table Profiling

-- Schema discovery
SELECT column_name, data_type
FROM information_schema.columns
WHERE table_schema = '<database>' AND table_name = '<table>';

-- Quick row count and date range
SELECT COUNT(*) as total_rows,
       MIN(created_at) as earliest,
       MAX(created_at) as latest
FROM <table>;

-- Sample data (always do this before analytical queries)
SELECT * FROM <table> LIMIT 5;

-- Null analysis (NULLIF avoids a division-by-zero error on an empty table)
SELECT
    '<column>' as field,
    COUNT(*) - COUNT(<column>) as null_count,
    ROUND((COUNT(*) - COUNT(<column>)) * 100.0 / NULLIF(COUNT(*), 0), 2) as null_pct
FROM <table>;

Cohort Retention

-- Assign each user to the month of their first activity
WITH cohorts AS (
    SELECT
        user_id,
        DATE_TRUNC('month', first_activity_date) as cohort_month
    FROM users
),
-- Reduce activity to the month in which it happened
activity AS (
    SELECT
        user_id,
        DATE_TRUNC('month', activity_date) as activity_month
    FROM user_activity
)
SELECT
    c.cohort_month,
    COUNT(DISTINCT c.user_id) as cohort_size,
    COUNT(DISTINCT CASE
        WHEN a.activity_month = c.cohort_month THEN a.user_id
    END) as month_0,
    COUNT(DISTINCT CASE
        WHEN a.activity_month = DATE_ADD('month', 1, c.cohort_month) THEN a.user_id
    END) as month_1,
    COUNT(DISTINCT CASE
        WHEN a.activity_month = DATE_ADD('month', 3, c.cohort_month) THEN a.user_id
    END) as month_3,
    COUNT(DISTINCT CASE
        WHEN a.activity_month = DATE_ADD('month', 6, c.cohort_month) THEN a.user_id
    END) as month_6
FROM cohorts c
LEFT JOIN activity a ON c.user_id = a.user_id
GROUP BY c.cohort_month
ORDER BY c.cohort_month;

Funnel Analysis

-- One row per user, with a 0/1 flag for each funnel step reached in the last 30 days
WITH funnel AS (
    SELECT
        user_id,
        MAX(CASE WHEN event = 'page_view' THEN 1 ELSE 0 END) as step_1_view,
        MAX(CASE WHEN event = 'signup_start' THEN 1 ELSE 0 END) as step_2_start,
        MAX(CASE WHEN event = 'signup_complete' THEN 1 ELSE 0 END) as step_3_complete,
        MAX(CASE WHEN event = 'first_purchase' THEN 1 ELSE 0 END) as step_4_purchase
    FROM events
    WHERE event_date >= DATE_ADD('day', -30, CURRENT_DATE)
    GROUP BY user_id
)
SELECT
    COUNT(*) as total_users,
    SUM(step_1_view) as viewed,
    SUM(step_2_start) as started_signup,
    SUM(step_3_complete) as completed_signup,
    SUM(step_4_purchase) as purchased,
    ROUND(100.0 * SUM(step_2_start) / NULLIF(SUM(step_1_view), 0), 1) as view_to_start_pct,
    ROUND(100.0 * SUM(step_3_complete) / NULLIF(SUM(step_2_start), 0), 1) as start_to_complete_pct,
    ROUND(100.0 * SUM(step_4_purchase) / NULLIF(SUM(step_3_complete), 0), 1) as complete_to_purchase_pct
FROM funnel;

Deduplication

-- Keep the most recent record per key (Presto/Athena syntax)
WITH ranked AS (
    SELECT
        *,
        ROW_NUMBER() OVER (
            PARTITION BY entity_id
            ORDER BY updated_at DESC
        ) as rn
    FROM source_table
)
SELECT * FROM ranked WHERE rn = 1;

Window Functions

The expressions below are SELECT-list fragments, not complete statements.

-- Running total
SUM(revenue) OVER (ORDER BY event_date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) as running_total

-- 7-day moving average
AVG(revenue) OVER (ORDER BY event_date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) as moving_avg_7d

-- Period-over-period comparison
LAG(value, 1) OVER (PARTITION BY entity ORDER BY event_date) as prev_value

-- Percent of total
revenue / SUM(revenue) OVER () as pct_of_total
revenue / SUM(revenue) OVER (PARTITION BY category) as pct_of_category

-- Ranking
ROW_NUMBER() OVER (PARTITION BY category ORDER BY revenue DESC) as rank_in_category

Period Comparison / Growth

When the user asks for "growth", "change", or "comparison" between periods, compute the delta, not raw totals.

WITH quarterly AS (
    SELECT
        category,
        QUARTER(order_date) as q,
        SUM(amount) as revenue
    FROM orders
    WHERE YEAR(order_date) = 2025
    GROUP BY category, QUARTER(order_date)
)
SELECT
    curr.category,
    prev.revenue as prev_period,
    curr.revenue as curr_period,
    -- 100.0 forces decimal arithmetic; NULLIF avoids dividing by a zero-revenue quarter
    ROUND(100.0 * (curr.revenue - prev.revenue) / NULLIF(prev.revenue, 0), 1) as growth_pct
FROM quarterly curr
JOIN quarterly prev ON curr.category = prev.category AND curr.q = prev.q + 1
ORDER BY growth_pct DESC;

Performance-Aware Patterns

-- Always filter on partition keys to reduce scan cost
SELECT region, COUNT(*)
FROM sales
WHERE year = '2024' AND month = '02'
GROUP BY region;

-- Use LIMIT for exploratory queries
SELECT * FROM large_table LIMIT 100;

-- Use approximate functions for large-scale cardinality
SELECT APPROX_DISTINCT(user_id) as approx_unique_users
FROM events;

Data Quality Checks

-- Distinct value counts per column
SELECT
    COUNT(DISTINCT col1) as col1_unique,
    COUNT(DISTINCT col2) as col2_unique
FROM <table>;

-- Detect unexpected values
SELECT column_name, COUNT(*) as cnt
FROM <table>
GROUP BY column_name
ORDER BY cnt DESC
LIMIT 20;

-- Check for join explosion: post_join far above pre_join means the join fans out rows
SELECT COUNT(*) as pre_join FROM table_a;
SELECT COUNT(*) as post_join FROM table_a a JOIN table_b b ON a.id = b.a_id;

Workgroup Selection

Always list workgroups first, before executing any query.

Detect Execution Context

Before selecting a workgroup, determine the current IAM identity:

aws sts get-caller-identity --query Arn --output text

The ARN pattern reveals the execution context:

| ARN Pattern | Context | Workgroup strategy |
|---|---|---|
| ¤KEEP0¤ | SageMaker Unified Studio project role | Use the project-scoped workgroup (see below) |
| ¤KEEP0¤ | SageMaker Unified Studio project role | Use the project-scoped workgroup (see below) |
| ¤KEEP0¤ | SageMaker notebook/studio role | Prefer ¤KEEP1¤ |
| Anything else | Standard IAM user/role | Follow general priority order |

SageMaker Project Role Selection

When running as a SageMaker project role (¤KEEP0¤ or ¤KEEP1¤):

  1. List all workgroups the role can access:

aws athena list-work-groups --query 'WorkGroups[].Name' --output json


  2. Extract the project ID from the role ARN. Split the role name on ¤KEEP0¤. The first segment is the prefix (e.g., ¤KEEP0¤), the second segment is the project ID (e.g., ¤KEEP0¤), and subsequent segments form the suffix (e.g., ¤KEEP0¤). Take the second segment. The project ID is an alphanumeric string (no hyphens). Known suffixes that follow the project ID: ¤KEEP0¤, ¤KEEP1¤, ¤KEEP0¤, ¤KEEP1¤. Example:

arn:aws:sts::123456789012:assumed-role/AmazonDataZone-abc123def-DataLakeAccess/session
                                                      ^^^^^^^^^ project ID = abc123def

  3. Match the workgroup to the project. Project workgroups follow the pattern ¤KEEP0¤ or contain the project ID.
  4. If exactly one ¤KEEP0¤ exists, verify that its suffix contains the project ID extracted in step 2. If it matches, use it. If it does not match, fall through to step 6.
  5. If multiple exist, pick the one whose suffix matches the project ID extracted from the role ARN. Optionally check environment variables ¤KEEP0¤ or ¤KEEP1¤ if the ARN extraction is ambiguous.
  6. If no ¤KEEP0¤ exists, do not fall back to other workgroups. Inform the user that no project-scoped workgroup was found and ask them to verify their project configuration or IAM permissions.

Project roles typically have IAM permissions scoped to their own workgroup. Attempting to use ¤KEEP0¤ or another project's workgroup will fail with AccessDeniedException. Do not retry with ¤KEEP0¤ in this context.
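
The extraction and matching steps above can be sketched in a few lines. This is an illustration under assumptions: the separator is elided in this copy of the document, so the hyphen is inferred from the example ARN; both function names are hypothetical; and the matcher simplifies steps 3 to 6 by returning a workgroup only when exactly one name contains the project ID.

```python
def extract_project_id(arn, separator="-"):
    """Return the second segment of the role name, or None if the ARN does not fit.

    ASSUMPTION: role names look like <prefix>-<projectId>-<suffix...>, as in the
    example ARN; the separator is inferred, not stated in the source.
    """
    resource = arn.split(":", 5)[-1]          # e.g. assumed-role/<role-name>/<session>
    parts = resource.split("/")
    if len(parts) < 2 or parts[0] not in ("assumed-role", "role"):
        return None
    segments = parts[1].split(separator)
    if len(segments) < 3:
        return None                            # needs prefix, project ID, and suffix
    project_id = segments[1]
    return project_id if project_id.isalnum() else None


def pick_project_workgroup(workgroups, project_id):
    """Return the single workgroup containing the project ID, else None.

    None means "do not fall back": inform the user instead of trying others.
    """
    if not project_id:
        return None
    matches = [name for name in workgroups if project_id in name]
    return matches[0] if len(matches) == 1 else None


if __name__ == "__main__":
    arn = "arn:aws:sts::123456789012:assumed-role/AmazonDataZone-abc123def-DataLakeAccess/session"
    pid = extract_project_id(arn)
    # The workgroup names below are made up for the demonstration.
    print(pid, pick_project_workgroup(["primary", "wg-abc123def"], pid))
```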

General Priority Order (Non-Project Roles)

1. ¤KEEP0¤ workgroups: most reliable, always have output locations configured
2. Workgroups with explicitly configured output locations
3. ¤KEEP0¤ workgroup (use with caution, may lack output location)

Error Recovery

| Error | Context | Action |
|---|---|---|
| No output location | Any | Retry with the next workgroup in priority order |
| AccessDeniedException on workgroup | Project role | Do not retry with other workgroups. Inform the user that their project role lacks access. |
| AccessDeniedException on workgroup | Standard role | Retry with the next workgroup in priority order |
| No workgroups found | Any | Ask the user to configure a workgroup or check IAM permissions |
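
The recovery table can be read as a small decision function. The sketch below is hypothetical throughout: `recovery_action` and the error labels `NoOutputLocation` and `NoWorkgroupsFound` are names invented for this example (only `AccessDeniedException` appears in the source). Because the skill also forbids switching workgroups without user confirmation, a retry is returned as a proposal rather than performed.

```python
def recovery_action(error, is_project_role, remaining_workgroups):
    """Map an error and its context to (action, workgroup) per the recovery table.

    remaining_workgroups: untried workgroups, already sorted in priority order.
    """
    if error == "NoWorkgroupsFound":
        return ("ask_user_to_configure", None)
    if error == "AccessDeniedException" and is_project_role:
        # Project roles are scoped to their own workgroup; retrying elsewhere fails.
        return ("inform_user_no_access", None)
    if error in ("NoOutputLocation", "AccessDeniedException"):
        if remaining_workgroups:
            # A proposal only: the user confirms before the workgroup changes.
            return ("propose_retry", remaining_workgroups[0])
        return ("ask_user_to_configure", None)
    # Unknown errors are not covered by the table; hand the decision to the user.
    return ("ask_user", None)


if __name__ == "__main__":
    print(recovery_action("NoOutputLocation", False, ["analytics", "primary"]))
    print(recovery_action("AccessDeniedException", True, ["analytics"]))
```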

Anti-patterns

- Never default to the ¤KEEP0¤ workgroup without checking others first
- Never hardcode a workgroup name across sessions
- Never retry with ¤KEEP0¤ when running as a SageMaker project role; it will fail with AccessDeniedException