#364 <<  #365 (final issue)

A New Face for the SQL Server Website on Microsoft.Ru

On June 29 a new version of the SQL Server section went live on the Microsoft Russia site: http://www.microsoft.com/Rus/sql/. A new design, a new selection of materials, the "Technologies" and "Solutions" sections translated into Russian - more than 40 updated and brand-new pages in total. Plans for the near future include adding more Russian-language materials and articles, launching a Russian section on SQL Server 2008, and updating the "Partners" and "Case Studies" sections, which will present Russian solutions built on SQL Server in a convenient form. Next week Microsoft Russia should also launch two more new sites: an updated BizTalk Server site (http://www.microsoft.com/Rus/biztalk/) and a completely new Microsoft Business Intelligence site (http://www.microsoft.com/Rus/bi/). Everyone is welcome! Feedback on the new websites can be sent to my address, akhromov@microsoft.com, with the subject "Feedback Web".

Andrey Khromov,
SQL Server and BizTalk product marketing manager, Microsoft Russia

CONTENTS

1. ARTICLES
1.1. An Introduction to Parallel Query Execution
1.2. 10 Ways to Optimize Full-Text Indexing in SQL Server (conclusion)
2. ARTICLE LINKS
2.1. Articles in Russian
2.2. Articles in English
3. SQL.RU FORUM
3.1. Most popular topics of the week
3.2. Questions left without an answer
4. Acknowledgements
 


ARTICLES

An Introduction to Parallel Query Execution

Based on the article by Craig Freedman: Introduction to Parallel Query Execution

SQL Server can execute a query on several processors at once. This capability is usually called parallel query execution. Parallel query execution can be used to reduce the response time of (that is, to speed up) large queries. It can also be used to run larger queries (queries that process more data) in the same time as smaller ones (scale-up) by increasing the number of processors used to serve the query. For most large queries SQL Server scales linearly or nearly linearly. Speed-up here means that if we double the number of processors, we see the response time cut roughly in half. Scale-up means that if we double both the number of processors and the size of the query, we get the same response time.

When is parallelism useful?

As I noted above, parallelism can be used to reduce the response time of a single query. However, parallelism has a cost: it adds overhead to query execution. Although this overhead is small, parallelism is undesirable for small queries (especially OLTP queries), where the extra cost is comparable to the total execution time. Moreover, if we compare the execution time of the same small query run in parallel on two processors against a serial run (no parallelism, one processor), the parallel run will typically take longer, often nearly twice as long. Again, this is explained by the relatively large share of parallelism overhead in a small query.
Parallelism is useful primarily for servers that run a relatively small number of concurrent queries. On such servers, parallelism lets a small number of queries make use of a large number of processors. For servers with many concurrent queries (for example, OLTP workloads), the need for parallelism is smaller, because all the processors will be busy anyway, simply because the system has enough queries to keep them busy. Parallelizing those queries would only add extra overhead and reduce overall system throughput.

How does SQL Server parallelize queries?

SQL Server parallelizes queries by horizontally partitioning the input data into a number of roughly equal partitions (the number of partitions is set to the number of available processors), and then running the same operation (for example, an aggregate, a join, and so on) on each partition. Suppose we want to use two processors for a query with a hash aggregate that groups by an integer column. We create two threads (one per processor). Each thread runs the same hash aggregate operator. The input data can be split, for example, by whether the value of the grouping column is even or odd. All rows that belong to the same group end up in the same thread and are processed by the same hash aggregate operator, so in the end we get the correct result.
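For illustration, here is a minimal sketch of the kind of grouping query described above; the table dbo.Orders and the column CustomerID are hypothetical. On a multi-processor server the optimizer may produce a parallel plan in which an exchange (parallelism) iterator hash-partitions the rows across threads and each thread runs its own copy of the hash aggregate over its partition.

-- Hypothetical grouping query of the kind discussed above.
-- In a parallel plan, rows are partitioned across threads and each thread
-- runs its own Hash Match (Aggregate) over its share of the data.
SELECT CustomerID, COUNT(*) AS OrderCount
FROM dbo.Orders
GROUP BY CustomerID;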

This method of parallel query execution is both simple and highly scalable. In the example above, the two hash aggregate threads run independently. They do not need to exchange messages or coordinate their work. If we need to increase the degree of parallelism (DOP), we can simply add more threads and adjust the partitioning function accordingly (in practice, rows are partitioned for a hash aggregate using a hash function, which works for any data type, any number of grouping columns, and any number of threads).
The actual partitioning and movement of data between threads is performed by the parallelism iterator (also called the exchange iterator). Although this iterator is quite special, it supports the same interfaces as any other iterator. Most iterators do not need to know anything about the fact that they are running in parallel. The appropriate parallelism iterators are simply added to the plan, and execution proceeds in parallel.
Note that the method of parallelism described above is not the same as "pipeline" parallelism, where several unrelated operators run simultaneously in different threads. Although SQL Server often places different operators in different threads, the primary reason is to be able to repartition the data as it passes from one operator to the next. With a pipeline, the degree of parallelism and the total number of threads would be limited by the number of operators.

Who decides whether to parallelize a query?

The query optimizer decides whether a query should be executed in parallel. Like most of its decisions, this one is cost-based. A complex, expensive query that processes many rows is more likely to be parallelized than a simple query that processes very few rows.

Who determines the degree of parallelism (DOP)?

DOP is not part of the cached, compiled query execution plan and can change at the next execution. DOP is determined at the start of execution, taking into account the number of processors on the server, the global configuration options "max degree of parallelism" and "max worker threads" set via sp_configure (visible only when "show advanced options" is enabled), and the MAXDOP query hint, if one is used. In short, the DOP is chosen so as to get parallelism without exhausting the worker threads. If MAXDOP 1 is specified, all parallelism iterators are removed from the plan and the query runs with a serial plan in a single thread.
Note that the number of threads used by a parallel query can exceed the DOP. If you watch sys.sysprocesses while a parallel query is running, you may see more threads than the DOP. As I said above, when data is repartitioned between two operators, the operators are placed in different threads. The DOP determines the number of threads per operator, not the total number of threads per query plan. In SQL Server 2000, if the DOP was less than the number of processors, the extra threads could use the remaining processors, which in effect could violate the intended MAXDOP setting. In SQL Server 2005, when a query runs with a given DOP, the number of schedulers is also limited: all the threads used by the query are assigned to the same set of schedulers, so the query uses only the DOP number of processors regardless of the total number of threads.
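As a hedged illustration of the knobs mentioned above, here is a sketch of the server-wide "max degree of parallelism" option and the per-query MAXDOP hint; the table dbo.BigSales is hypothetical.

-- Server-wide cap on parallelism (visible after enabling advanced options):
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max degree of parallelism', 4;
RECONFIGURE;

-- Per-query override; MAXDOP 1 removes the parallelism iterators and
-- forces a serial, single-threaded plan:
SELECT ProductID, COUNT(*) AS SalesCount
FROM dbo.BigSales          -- hypothetical table
GROUP BY ProductID
OPTION (MAXDOP 1);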

[Back to top]

10 Ways to Optimize Full-Text Indexing in SQL Server

Based on the article by Hilary Cotter: 10 Ways to Optimize SQL Server Full-text Indexing

4. Reduce the load during population

Do not run SQL Server Profiler during population, and keep any other database activity to a minimum. Profiler consumes significant resources. For the same reason, do not run the query shown below in a continuous loop, because it hurts performance:

SELECT FULLTEXTCATALOGPROPERTY('CatalogName', 'PopulateStatus')
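If you do need to track population progress, a gentler approach (a sketch, reusing the placeholder catalog name from the query above) is to poll at intervals rather than in a tight loop:

-- Poll every 30 seconds instead of continuously; 0 means the catalog is idle.
WHILE FULLTEXTCATALOGPROPERTY('CatalogName', 'PopulateStatus') <> 0
BEGIN
    WAITFOR DELAY '00:00:30';
END;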

5. Increase the number of threads for the indexing process

By increasing the number of threads you spread the load of the indexing process. The default is only five threads, so on four- or eight-processor systems you can safely use many more. However, MS Search can choke on large volumes of data, so avoid applying this recommendation on one- or two-processor systems. Here are the registry keys you need to edit:

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Search\1.0\Gathering Manager\Servers\SQLServer\EvaluationOrder
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Search\1.0\Gathering Manager\Servers\SQLServer\HitInterval

Another trick, if your application allows it, is to split the data across several catalogs. For example, if you split your table in two, you will have twice as many threads serving the indexing.

6. Invest in the best hardware you can afford

Buy systems with four or more processors, the highest clock speed, and a large L2 cache. Eight processors are better. If you cannot afford an eight-processor server, invest in the fastest disk subsystem you can buy (Microsoft does not support placing SQL Server databases or full-text catalogs in memory, on RAM disks).

7. Stop antivirus software and file backup agents

If they cannot be stopped, try excluding from their scans the temporary directories used by SQL FTI and the directories of the full-text catalog itself, which by default is located here:

C:\Program Files\Microsoft SQL Server\MSSQL\FTData

8. Give the catalog its own disk controller

Place the catalog on a separate controller, preferably on a RAID-1 disk array.

9. Place the temporary directory and the page file on a RAID-1 array

You can move the temporary directory to a RAID-1 array with the SetTempPath.vbs script, which can be found in this folder:

C:\Program Files\Common Files\System\MSSearch\bin

The same should be done with the page file, which is best placed on its own RAID-1 array with a separate disk controller.

10. Be prepared for maximum CPU utilization

German and (especially) Asian languages spend a great deal of resources on word breaking, and CPU utilization often reaches 100 percent. Although that in itself is not a bad thing, it can become a problem if the Server Work Queues performance counter exceeds two, which forces your server to spend its time allocating resources and managing the queue instead of doing useful work.

A note on BLOBs

Web servers are optimized for serving files to clients, while RDBMSs are optimized for returning result sets, so web servers outperform RDBMSs when returning large binary objects to clients. In general, the more of these objects you move out of the database onto your web server, the better overall database performance will be. You may therefore want to consider keeping copies of your BLOB data not only in the database but also as files. Then, when building your web pages, you can link to the file versions instead of fetching the BLOBs from the RDBMS (I realize that where to store large binary objects and large text data has always been a matter of debate among specialists. In his article, Aaron Bertrand, a SQL Server MVP, discusses some of these approaches and provides links to Microsoft KB articles on working with BLOB data: http://www.aspfaq.com/show.asp?id=2149).

Examples

You can download two samples. The code in the first sample shows how to load the database with files located in a directory (there are several ways to load data into the database, including ADO's GetChunk and AppendChunk, TextCopy, and plain bcp; for loading I prefer the ADO Stream object, which is what the first sample uses). The code in the second sample lets you monitor the progress of a full or incremental population, peeking into an unfinished population and estimating how close the population process is to completion (for this, Profiler is started for a short time). The estimate is based on the fact that, on average, 5 percent of the catalog build process is spent compiling the catalog. The actual time, of course, will depend on the number of words and how frequently they are used.

The sample files can be downloaded from this link: 406HILARY.ZIP

Full-text search in SQL Server 2005

Quite a few improvements related to SQL FTS have been promised for SQL Server 2005. I have grouped them into several categories:

  • Architecture - the most significant improvement is in indexing and search performance; early tests show an improvement of nearly two times. The main reason is that Microsoft has integrated the search engine into the database engine. In SQL 7 and SQL 2000 the search engine was a separate service shared by all instances, and not only by SQL Server but also by SharePoint Portal Server Search and Exchange Content Indexing. The search indexes have also been moved into the database. Not only does this improve performance, it also makes it possible to copy a database and its full-text indexes as a single unit. The bottom line: the new search engine can index up to 2 billion rows (the limit in SQL 2000 was between three and 30 million rows).

  • T-SQL improvements - support for thesaurus searches, and the syntax for creating a full-text index is now very similar to the CREATE INDEX syntax used for ordinary tables. In previous versions SQL Search could not handle noise words or diacritical marks well. The new version of SQL Search that ships with SQL 2005 solves these problems. You can now also specify one or more columns in a full-text query.

  • Reporting tools - full-text search in SQL 2005 has improved reporting and logging facilities. Two new tools ship with it: Lrtest, which lets you determine exactly how a search term was broken up by the word breaker during querying and indexing; and CiDump, which lets you dump a catalog to see exactly what the full-text catalog contains.

Incidentally, Yukon FTS supports the XQuery engine when documents are stored as XML, using the XML data type for this.

Resources

  • Andrew Cencini's white paper, "SQL Server 2005 Full-Text Search: Internals and Enhancements"

  • 323739 INF: SQL Server 2000 Full-Text Search Deployment White Paper

  • 827449 How to manually reinstall the Microsoft Search service for an instance of SQL Server 2000

  • 817301 PRB: Microsoft Search Full-Text Catalog Population Fails When You Upgrade SQL Server 2000 Failover Cluster to SQL Server 2000 SP3

  • 317746 PRB: SQL Server Full-Text Search Does Not Populate Catalogs

  • 811064 Support Webcast: Microsoft SQL Server 2000: Full-Text Search and SP3

  • 25136 Support Webcast: SQL Server 7.0 Full-Text Search and Basic Troubleshooting Methods

  • 308771 PRB: A Full-Text Search May Not Return Any Hits If It Fails to Index a File

  • 240681 HOWTO: Query Index Server for Custom Properties with SQL Query Analyzer

[Back to top]

ARTICLE LINKS

Articles in Russian

eLearning: What's New in Microsoft® SQL Server™ 2008
Konstantin Kosinsky
Among Microsoft's eLearning courses, the first one describing the features of the new SQL Server version has appeared: https://www.microsoftelearning.com/eLearning/offerDetail.aspx?offerPriceId=128041.

SQL Server 2008 documentation is already available on the web
Alexander Gladchenko
Microsoft keeps surprising us with new ways of introducing specialists to upcoming versions of its products. Among the MSDN2 resources I discovered a site with the online documentation for SQL Server 2008. Now you can learn about the new version's features first-hand, without having to download the huge BOL 2008 distribution.

Using the OUTPUT clause to track DML changes
Lyudmila Fokina
Did you know that... you can track data changes without using triggers: in SQL Server 2005 the OUTPUT clause is part of the DML statement syntax.
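A minimal sketch of the OUTPUT clause this note describes; the table dbo.Accounts and its columns are hypothetical.

-- Return the old and new values changed by the UPDATE, without a trigger.
UPDATE dbo.Accounts
SET Balance = Balance - 100
OUTPUT deleted.AccountID,
       deleted.Balance  AS OldBalance,
       inserted.Balance AS NewBalance
WHERE AccountID = 42;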

A new blog by a Russian SQL Server Club member
Alexander Gladchenko
Today the first post appeared in Nikolay Denishchenko's technical blog: "Declarative Management Framework - defrosting the refrigerator".

Declarative Management Framework - defrosting the refrigerator
Nikolay Denishchenko
Ever since I installed the "Katmai" CTP3, I cannot shake the heretical thought that SQL Server DBAs long ago gave up begging Santa Claus for a new BMW. Instead they turned their gaze to a more realistic wish-maker - Microsoft - flooding the company with requests for a big "Configure everything" button and for a "Big Brother" to keep an eye on the servers in their care. How else can you explain the appearance of the Declarative Management Framework? Alexey Khalyako has already covered the basic concepts of DMF, so here I will allow myself some brief musings on what to expect from this technology, whether DBAs will start quoting Renata Litvinova ("I'm flying, I'm in heaven"), and which scenarios are available in CTP3.

When installing SQL Server 2005, the error message "An unexpected error occurred while running the Setup Wizard" appears
Microsoft
Bug #: 408784 (SQLBUDT). Problem: When installing Microsoft SQL Server 2005, the following error message appears: An unexpected error occurred while running the Setup Wizard. Consult the setup log or click the Help button for more information.

An updated SQL Server page
Anton Belousov
Today at 9 a.m. our updated SQL Server page went live. Fresh off the press, so to speak :) I won't go on about what's new there; here is the link right away: http://www.microsoft.com/Rus/sql/ Your comments and suggestions about the page are welcome. They will definitely be taken into account :)

A sysadmin's rights (the "piracy through a sysadmin's eyes" FAQ)
Renat Minazhdinov
Should an instruction to use pirated software be documented somehow? If we are talking about a memo from the system administrator to the head of the organization about the use of unlicensed software, then it is not quite correct to speak of an instruction to use pirated software. The manager's answer will not necessarily be "I instruct system administrator Ivanov to use unlicensed software"; most likely it will not be. What the system administrator needs is documentary proof of the very fact that he approached the director - proof that the head of the organization knew that pirated software was in use. To achieve this, the memo should be handed to the general director against signature, so that the manager signs the copy that stays with the system administrator. That said, such a memo with the manager's acknowledgment is not a cure-all: it only proves that the manager knew about the violation of the law and prevents him from shifting all the blame onto the system administrator by claiming that the latter never informed him of the violation. The system administrator will most likely still bear criminal liability together with the director, but the memo will help prove that the system administrator broke the law under material and job dependence, which under clause "e" of part 1 of article 61 of the Criminal Code of the Russian Federation is a mitigating circumstance. In general, the questions around the criminal liability of a system administrator who has reported the use of unlicensed software to his manager are currently being actively discussed within the professional community of sysadmins themselves, but are not being raised with government bodies (law enforcement first of all).

CTP3: DMF. Continuing our study. Policy groups and the dacha season
Alexey Khalyako
After a simple example of prohibiting bad words from being written on the fence ended in success, I got the idea to look into what "user-defined groups" actually are.

Available for download: ADO.NET Entity Framework June 2007 CTP
Alexander Gladchenko
On July 2, 2007 a preview (CTP) build of the ADO.NET Entity Framework, which will be used for development in Visual Studio "Orcas", became available on the Microsoft download site. This release includes changes to Object Services, Query, Entity Client and the Entity Data Model Wizard. Several new features have been added:

Clustering from the inside, mysterious hangs, the SA account, and more
Nancy Michell
Q: I need to understand how clustering works. Our environment will run 64-bit Windows Server® 2003 hosting SQL Server™ 2005, a SQL Server Reporting Services (SSRS) web farm (a scale-out report server deployment), an SSRS catalog and TempDB server, and a SQL Server instance that pulls data from a third-party database over a linked server and stores data for SSRS. We need a 3-node active/active/passive cluster. Node 1 will be active and will hold the data pulled from the third-party database. Node 2 will be active and will hold the SSRS catalog. Node 3 will be passive and will act as the standby for node 1 or node 2. Could you help?

Troubleshooting remote procedure call errors
Zubair Alexander
If you have worked with Windows server platforms for several years, you have probably run into remote procedure call errors from time to time. The errors mention an unavailable RPC server, no available endpoints, or other puzzling warnings. If such messages leave you baffled, read this article. I will cover some common errors, various methods for diagnosing RPC errors, and show some solutions to specific problems. Before we look at specific RPC errors and their solutions, let's sort out RPC terminology.

Rethinking the pipeline
Don Jones
Much has been said about how unusual, new and exciting Windows PowerShell is, even though the environment is based on command-line concepts that have been known for decades, mostly in UNIX- and Linux-based operating systems. But the common terminology that Windows PowerShell shares with those shells can easily distract from the genuine flexibility and uniqueness of Windows PowerShell™ and its outstanding suitability for the Windows® environment.

SQL Server 2005: the CONTROL permission
Lyudmila Fokina
Did you know that... in SQL Server 2005 you can grant a user the CONTROL permission on an object (a table, a view, and so on), which effectively gives the same rights as the object's owner has.
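A one-line sketch of granting that permission; the table dbo.Invoices and the user name are hypothetical.

-- CONTROL on a single object effectively grants owner-equivalent rights on it.
GRANT CONTROL ON OBJECT::dbo.Invoices TO SomeUser;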

10 Ways to Optimize SQL Server Full-Text Indexing
Hilary Cotter
In this article Hilary Cotter describes techniques you can use to make SQL Server full-text indexing (FTI) better and faster.

Working with data. Yesterday, today, tomorrow...
Serg Vorontsov
I did not want to do yet another global comparison of two different technologies, two different approaches and, most importantly, two eras in data access methods. Every current book on ADO.NET is full of such comparisons. But unfortunately, to really understand what truly distinguishes ADO from ADO.NET, I will still have to make a brief excursion into this area. It will be especially useful for those with ADO/OLE DB development experience.

Working with data yesterday, today, tomorrow... DataReader
Serg Vorontsov
Less than a week has passed since the previous article was written. But, as the statistics show, our visitors are interested in what has been done, which naturally inspires me to continue publishing in this series.

Working with data yesterday, today, tomorrow... Typed dataset
Serg Vorontsov
Typed dataset. How much that sound holds for our hearts. With these slightly altered words of a famous Russian classic I would like to begin describing what is probably the largest and most interesting chapter in the ADO.NET data access machinery.

Working with data yesterday, today, tomorrow... The schema in a Typed dataset
Serg Vorontsov
So, we finally move on to the third part - an article devoted to creating schemas for typed datasets. In the previous articles we covered the details, in other words the "internals", of typed datasets in some depth, and now it is time to look at the system as a whole.

Working with disconnected data in ADO.NET
Denis Kondratiev
ADO.NET, unlike its predecessors ADO and OLE DB, was designed specifically for use in web applications, where there are no persistent database connections. Traditional data access in ADO.NET follows this pattern: a Connection is created, then opened with the Open method; a Command object encapsulating a SQL command is created and executed; and the connection is then closed. This approach provides streamed access to query results: when reading data with a DataReader you cannot skip over several records or go back to a previous one. Streamed access gives the best performance.

The July release of SQL Server 2005 Best Practices Analyzer is ready
Alexander Gladchenko
I learned this news today from Glenn Berry's blog. Remarkably, labeling releases with a month and year is becoming fashionable at Microsoft for more than just CTPs. I hope no one needs an explanation any more of what BPA is and what it is for :) Just download it and enjoy!

A certificate for SSL traffic encryption in SQL Server
Yan Liberman
Setting up SSL traffic encryption in SQL Server can be roughly divided into two stages: obtaining a certificate and actually configuring SQL Server. While the latter is easily done in 5 minutes through the graphical interface, the first step can cause problems. Where do you get a certificate? There are several options:

[Back to top]

Articles in English

SQL Server Full Text Search Optimization
Tony Bain
Full text indexing is a great feature that solves a long-standing database problem: searching textual data columns for specific words and phrases in SQL Server databases.

Full-text Search in Backup & Recovery
Kevin Kline
Are you a user of full-text search on SQL Server 2000? If so, you know that you must back up the full-text catalog separately from any database or transaction log backups that you perform.

Rico Mariani Starts LINQ to SQL Performance Analysis Series
Roger Jennings
Rico Mariani is a "Performance Preacher" in Microsoft's Developer Division and a co-author of the "Improving .NET Application Performance and Scalability" patterns & practices white paper. His Channel9 video, "Rico Mariani: Writing better, faster code," was the featured "Behind the Code" show segment earlier this year. He also starred in "Rico Mariani - Making .NET Perform (Happy Birthday Video #2)" produced in February 2005.

New Security Roles in CTP3
Patrick Husting
I'm working with Microsoft on a technical whitepaper and received some information from the PM on security about new Roles in M&A. Server roles apply to the whole Dashboard Designer installation and grant system-wide access to data and tasks. There are four types of server roles:

How to configure Database mirroring for use with ODBC applications?
Ramu Konidena
If you have an ODBC application that is connecting to a mirrored database and if

Fill factor value for Indexes
Decipherinfosys
Whenever you create a new index in SQL Server or rebuild an existing index, you can specify several optional parameters - one of them is the fill factor. This parameter controls how full the leaf-level pages of the index are when they are created. So, a fill factor value of 100% or 0% means that each index page is 100% full; a fill factor of 90% means that each index page is 90% full and 10% of each leaf-level page is left empty, providing space for future index expansion as DML statements occur. The default is 0%, which is the same as 100%.
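A short sketch of specifying the fill factor described above; the index and table names are hypothetical.

-- Create (or rebuild) an index leaving 10% free space in each leaf page.
CREATE NONCLUSTERED INDEX IX_Orders_CustomerID
ON dbo.Orders (CustomerID)
WITH (FILLFACTOR = 90);

ALTER INDEX IX_Orders_CustomerID ON dbo.Orders
REBUILD WITH (FILLFACTOR = 90);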

Transaction marks
Hilary Cotter
There seems to be some confusion out there on the internet about how to implement transaction marks. Or perhaps I was merely very confused about how to implement them. Transaction marks allow you to restore two or more databases to a consistent point in time. For example, if you have two databases and your app writes to both, how do you restore them to a consistent point in time without kicking all your users off the databases? This is complicated by many things, for example backup speed. I googled looking for a working example but could not find one, so here is a script I wrote to demonstrate it.
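A minimal sketch of marked transactions as described above (not the author's own script); database, table and backup file names are hypothetical.

-- Mark a transaction that spans related work in two databases...
BEGIN TRANSACTION TransferBatch WITH MARK 'Nightly transfer';
    UPDATE DatabaseA.dbo.Source SET Processed = 1;
    INSERT INTO DatabaseB.dbo.Target (Id)
    SELECT Id FROM DatabaseA.dbo.Source;
COMMIT TRANSACTION TransferBatch;

-- ...then restore each database's log chain to the same mark.
RESTORE LOG DatabaseA FROM DISK = 'C:\Backups\A_log.trn'
    WITH STOPATMARK = 'TransferBatch', RECOVERY;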

Handy replication proc
Hilary Cotter
A friend of mine called me up last night. He had a problem replicating 300,000,000 transactions per day, or was it per hour. Can't recall, it was a late night. Anyhowze, with volumes like this replication might not be the best tool. His problem was that the distribution agent to two of his subscribers was hanging on initializing. From experience I know this is caused by depleted buffers on the publisher and the only solution I know of is a reboot. Naturally this was not an option for him.

SQL Full-text search anomalies
Hilary Cotter
On the newsgroups recently I have encountered some posters who are perplexed by the results of their full-text searches. The first poster was puzzled as to why searches on f.b.i. would not find F.B.I. in his content. SQL FTS and Microsoft Search products index uppercased tokens with periods in them differently than other lower or upper cased tokens. During indexing most language word breakers will index upper cased tokens with periods in them as the token, and then the token with all the periods stripped out. So F.B.I. is indexed as F.B.I. and FBI. All tokens are stored upper cased in the index with the exception of path names (for cosmetic reasons - e.g. in Indexing Services a path like c:\TeMp is stored as c:\temp, as lowercase was considered to look better than uppercase).

Reading large Profiler Trace files
Tony Rogerson
What was I thinking? I double clicked on a profiler trace file, only 5GB, and up it started to come in Profiler; I got distracted (the kettle boiled) and came back and realised my machine was struggling a bit - strange, it's a good workstation - 4GB, fast AMD etc... Oh dear, then it dawned on me what I'd done, I was opening a 5GByte file in Profiler - it's trying to load it all into memory - lol - whoops.... What I should have done is this...

Row fragmentation (Hopscotch) - Heap v Clustered and IO cost
Tony Rogerson
Thankfully in my travels I see fewer and fewer heaps (a table without a clustered index); frankly they should be banned, taken out of the engine, and a clustered index made mandatory. In this post I look at row hopscotch (row fragmentation) and the effects it has on application performance (IOs and concurrency); you also see stark evidence of just why you should always have a clustered index.

SQL Server 2008 new features - the list
Bob Beauchemin
There was a fairly well-known Powerpoint slide that attempted to summarize the new SQL Server 2005 features in bullet points of a single slide. By the release, there were so many new features, the feature list had to be rendered in a 5-point font to fit. At TechEd 2007, Microsoft presented the new features of SQL Server 2008 (was: SQL Server Katmai) in an analogous format. Although it's not yet down to a 5-point font, there are quite a few impressive new features on tap. Here's the list, modulo (my own) potential spelling errors. There is currently a CTP of SQL Server 2008 as well as beta1 of the next version of Visual Studio.NET (Orcas) and the ADO.NET synchronization framework. The features I can find in the current betas have asterisks.

Grouping sets and TABLIX - made for each other?
Bob Beauchemin
SQL Server 2008 will include GROUPING SETS; a (very) short explanation is that these allow the equivalent of multiple GROUP BY clauses in a single SQL statement. The result is a UNION ALL of the resultsets. SQL Server 2008 also contains/allows standard syntax for ROLLUP and CUBE, which have been in SQL Server for a while.
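A hedged sketch of the SQL Server 2008 syntax being described; the table dbo.Sales and its columns are hypothetical.

-- One statement returning the equivalent of two GROUP BY queries UNION ALLed.
SELECT Region, Product, SUM(Amount) AS Total
FROM dbo.Sales
GROUP BY GROUPING SETS ( (Region), (Product) );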

Table valued parameters in SQL Server 2008
Bob Beauchemin
The June CTP of SQL Server 2008 contains support for table-valued parameters. Here's a usage scenario for these that has been around for a while. Imagine you are running an online store and deal with (among other data) orders and order detail lines. You'd like to have a stored procedure that can add an entire order in one server round trip, regardless of the number of items in the order (that is, 1 order header and 1->n order detail lines). You'd even settle for two round trips, one for the order header, one for the details. Before SQL Server 2008, there is no built-in mechanism that supports this. In the past, I've seen some pretty interesting workarounds, such as:
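A sketch of the table-valued parameter scenario described above (SQL Server 2008 syntax); all object names are hypothetical, and dbo.OrderHeader is assumed to have an identity key.

-- A table type describing one order detail line.
CREATE TYPE dbo.OrderLineType AS TABLE
(
    ProductID int   NOT NULL,
    Quantity  int   NOT NULL,
    Price     money NOT NULL
);
GO
-- One round trip: the header plus any number of detail lines.
CREATE PROCEDURE dbo.AddOrder
    @CustomerID int,
    @Lines      dbo.OrderLineType READONLY   -- TVPs must be READONLY
AS
BEGIN
    INSERT INTO dbo.OrderHeader (CustomerID) VALUES (@CustomerID);

    INSERT INTO dbo.OrderDetail (OrderID, ProductID, Quantity, Price)
    SELECT SCOPE_IDENTITY(), ProductID, Quantity, Price
    FROM @Lines;
END;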

A supported Service Broker external activator?
Bob Beauchemin
Had to write about another thing that "caught my ear" at TechEd during a chalktalk by Rick Negrin about Service Broker usage patterns. Service Broker supports "internal activation", that is, associating a stored procedure that gets invoked when a queue has messages to process, as well as "external activation". When using external activation, an event notification occurs when a queue has messages to process, and this notification is picked up by an external application; the external application processes the queue messages, out-of-process to SQL Server.

SQL Server system data types implemented in .NET
Bob Beauchemin
SQL Server 2008 will contain, if my count is correct, 7 new data types. Note: none of these are in the current CTP.

ORDPATH, ORDPATH, everywhere
Bob Beauchemin
ORDPATH is a hierarchical labeling scheme used in the internal implementation of the XML data type in SQL Server 2005. It's meant to provide optimized representation of hierarchies, simplify insertion of nodes at arbitrary locations in a tree, and also provide document order. It's described in the academic whitepaper "ORDPATHs: Insert-Friendly XML Node Labels". In addition to being used internally when the XML data type is stored, it's also part of the key of the PRIMARY XML INDEX, used to speed up XQuery. It's an implementation/specialization of the path enumeration model of representing hierarchies in relational databases, mentioned in Joe Celko's book "Trees and Hierarchies in SQL".

And the EAV winner is .... sparse columns
Bob Beauchemin
Many of you have already heard the "hardware store" story. What's the best way to model products in a hardware store, where new items arrive at the hardware store each day. Each item has a "short list" of similar properties (e.g. UPC, price) and a long list of dissimilar properties (e.g. paint has color, type, amount and curtain rods have width, metal, etc). How to model the dissimilar properties for each item in relational table(s)?

SqlCredit Part 6: Exploring EncryptByCert and DecryptByCert
Rob Garrison
This is part 6 of a series. If you have not read part 5, you can find it here. There are many good articles available that cover data encryption and data security in general. I will not try to explain why you need to encrypt sensitive data within your database. If you don't already have someone knocking on your door telling you to do this, listen carefully and you can hear them coming down the hall. For years, many database designers have played the performance card and gotten away with storing plain-text data inside the database. They (we) have created elaborate security schemes around the database, but if someone could actually get access to the database itself, they were in. SQL Server 2005 includes support for encrypting and decrypting data for storage using EncryptByCert. Let's look at how to implement this in the SqlCredit database. We will use it to encrypt the credit card number and security code that were previously stored in plain text.
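A minimal sketch of the EncryptByCert/DecryptByCert pattern the article explores; the certificate, password, table and columns are hypothetical, and CardNumberEnc is assumed to be a varbinary column.

-- Create a certificate whose private key is protected by a password.
CREATE CERTIFICATE CardCert
    ENCRYPTION BY PASSWORD = 'Str0ngP@ssw0rd!'
    WITH SUBJECT = 'Credit card encryption';

-- Encrypt on the way in.
UPDATE dbo.Cards
SET CardNumberEnc = EncryptByCert(Cert_ID('CardCert'), CardNumber);

-- Decrypt on the way out (the password unlocks the certificate's private key).
SELECT CONVERT(varchar(32),
       DecryptByCert(Cert_ID('CardCert'), CardNumberEnc, N'Str0ngP@ssw0rd!')) AS CardNumber
FROM dbo.Cards;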

SQL 2005 Books On Line Loading Options
Don Hite
Sometimes you may find that when you are in SQL Management Studio or creating SQL queries, selecting Help makes your machine seemingly take a long time to initialize the Help system, or the searches take an inordinate amount of time to load and complete. In SQL 2005 this is not a reflection on the performance of SQL or your local machine; it is usually the internet connectivity that is seemingly slowing you down and causing your coffee to turn cold in its pretty pale Styrofoam cup.

SQL SERVER - 2005 Row Overflow Data Explanation
Pinal Dave
In SQL Server 2000 and SQL Server 2005 a table can have a maximum of 8060 bytes per row. One of my fellow DBAs believed that SQL Server 2000 had that restriction but SQL Server 2005 does not, and that it can have rows of 2GB. I agreed with him at first, but after discussing it in depth we realized that there is more to it than just the 8060-byte limit.

SQL SERVER - Comparison Index Fragmentation, Index De-Fragmentation, Index Rebuild - SQL SERVER 2000 and SQL SERVER 2005
Pinal Dave
Index Fragmentation: When a page of data fills to 100 percent and more data must be added to it, a page split occurs. To make room for the new data, SQL Server must move half of the data from the full page to a new page. The new page is created after all the existing pages in the database. Therefore, instead of going straight from one page to the next when looking for data, SQL Server has to jump from page to page around the database looking for the next page it needs. This is index fragmentation. The severity of the index fragmentation can be determined by querying sys.dm_db_index_physical_stats.
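A short sketch of the fragmentation check mentioned above (SQL Server 2005); it scans the current database in LIMITED mode.

-- List indexes in the current database by fragmentation level.
SELECT OBJECT_NAME(ips.object_id)        AS TableName,
       i.name                            AS IndexName,
       ips.avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
JOIN sys.indexes AS i
  ON i.object_id = ips.object_id AND i.index_id = ips.index_id
ORDER BY ips.avg_fragmentation_in_percent DESC;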

SQL Server Express 2005 is now supported by XPe and WEPOS platforms
SavithaP
We have received numerous customer requests for supporting SQL Server 2005 on XPe and WEPOS. Listening to your requests, we are now happy to announce the support for SQL Server 2005 Express Edition starting with Service Pack 2 on Windows XP Embedded Service Pack 2 Feature Pack 2007 (XPe) and Windows Embedded for Point of Service SP2 (WEPOS). Here is additional information for your reference:

Pop Rivett and the Expanding Log
Pop Rivett
"Aargh! Quick Pop, we need help! The website's down, we can't trade, we're losing business and the browser page shows nothing but some gobbledygook saying 'Can't allocate space for object syslogs in database Snibbo because the logsegment is full'. What should I do!?"

Summer SQL Teaser #4 Nulls and Counts
Denis Gobo
For some more NULL summer fun you can read NULL Trouble In SQL Server Land. First create this table

Working days calendar in T-SQL
Joe Celko
SQL Apprentice Question: My company's working days are Monday to Friday. I would like to generate a result set which shows all consecutive working days (Monday to Friday, excluding the weekends); that is, for June 2007 it should show me the 1st, 4th, 5th, 6th, 8th, 11th, etc., up to the 29th (the last working day of June). Is it possible to do this with T-SQL?
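One possible sketch of an answer (not Joe Celko's own solution): generate the days of June 2007 with a recursive CTE and keep Monday through Friday.

-- Working days (Mon-Fri) of June 2007.
SET DATEFIRST 1;                              -- Monday = 1
WITH Days AS
(
    SELECT CAST('20070601' AS datetime) AS d
    UNION ALL
    SELECT DATEADD(day, 1, d) FROM Days WHERE d < '20070630'
)
SELECT d AS WorkingDay
FROM Days
WHERE DATEPART(weekday, d) <= 5               -- Monday..Friday
ORDER BY d
OPTION (MAXRECURSION 31);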

I have problem with union
Joe Celko
SQL Apprentice Question: I use SQL Server 2000. I have a structure like this

On Metadata-Driven Analysis Management Objects (Part 3)
Adrian Downes
Enhancements to the CreateNewAggregationDesign Function: Most BI practitioners will be taking full advantage of multiple-measure group support in SSAS 2005 cubes (UDMs) for a variety of reasons including support for one of the many-to-many dimension design patterns eloquently positioned by Marco Russo in his work "The Many to Many Revolution". I have two successful implementations which build on his great ideas, and, it was during this phase of development that I came across an opportunity to enhance our code base for our modest Metadata-Driven-AMO "quiver of arrows" (to quote a good friend of mine).

SSIS Design Pattern: Using The DontSaveSensitive Package ProtectionLevel Property
Andy
I've read several MSDN SSIS Forum posts recently on the topic of configuration and deployment management. I get these questions in most of my SSIS classes as well.

SSIS Design Pattern - Dynamic SQL
Andy
I sometimes miss the ActiveX task in DTS. It was cool because "if you could see it, you could change it." I used the ActiveX Task to generate dynamic SQL.

SSIS Design Pattern - EncryptSensitiveWithPassword
Andy
Deploying SQL Server Integration Services packages is one of the steeper paths on the slopes of the SSIS learning curve.

SSIS Design Pattern - Custom Error Handling
Andy
I was working on error handling with SSIS recently and came up with what I believe is a snappy way to address it. The solution presented below is partial. For one, I would encourage you to use a database table to house errors; for another, I would encourage you to store lots more data than merely the ErrorDescription field.

SSIS Design Pattern - Incremental Loads
Andy
Loading data from a data source to SQL Server is a common task. It's used in Data Warehousing, but increasingly data is being staged in SQL Server for non-Business-Intelligence purposes.

Andy's SQL Server 2008 June CTP Risk-Averse Installation
Andy
So, you have the SQL Server 2008 June CTP bits and you cannot wait to install them and start tinkering! Spend some time first with the ReadMe files - please.

Advanced Ranking and Dynamically-Generated Named Sets in MDX
Chris Webb
Calculating ranks is a classic MDX problem, and as is usually the case Mosha has an excellent blog entry on the subject which I would encourage you to read if you haven't done so already:

SQL Server 2005 I/O Capacity Issues
Glenn Berry
Most large scale SQL Server 2005 deployments sooner or later run into I/O bottlenecks. There are several reasons for this. First, engineers often just think about CPU and RAM when sizing "big" database servers, and second, many DBAs are not able to completely tune the workload to minimize excessive I/O. Finally, there are often budgetary issues that prevent the acquisition of enough I/O capacity to support a large workload.

SQL Server 2005 Express Edition - Part 1
Marcin Policht
In order to address the diverse needs of its customers as well as to provide balance between offered feature sets, hardware/software footprints, and pricing, Microsoft has been consistently diversifying its database management portfolio, delivering several options with each new release of its SQL Server product line, with their own, distinct characteristics targeting specific market niches. In the case of SQL Server 2005, the list of available choices consists of the following:

SQL server 2008 (Katmai) Declarative Management Framework (DMgF) - enforce in multi-server environment
Satya SK Jayanty
SQL Server 2008 brings a new feature, the Declarative Management Framework (DMgF - we cannot call it DMF, since that already stands for Dynamic Management Functions in SQL 2005). This is a policy-based system for managing a SQL Server instance: using SSMS you can create policies to manage entities on that SQL instance. It gives the DBA more power for effective management of the server instance, databases and objects. The three main components of this framework are: Policy Management, which helps create policies by setting a condition and a target set that can be a server, a database or a database object; Explicit Administration, to select one or more managed targets and check whether they comply with a specific policy or set of policies; and lastly, Automated Administration for execution using these modes:

SSIS and Disaster Recovery Planning in multi-user environment
Satya SK Jayanty
I have had an interesting discussion in the SSP World about how to grant access and privileges to handle SSIS packages in a high-availability and disaster recovery scenario. As you may be aware, the privileges involved include access to Distributed COM components, including MsDtsServer.

How to restore from a transaction log
Greg Robidoux
Having a backup plan in place is half the battle in keeping your SQL Server up and running. The other half involves the restore process. You need to restore SQL Server after a failure (when you need to restore to a standby server) or when refreshing a reporting or development environment. Each of these scenarios can involve restoring transaction log backups.

Do I have to be admin?
Michael Entin
Often, I see people reporting "zzz fails if I'm not admin". Do I have to be machine admin to use SSIS? Well, absolutely not. Most SSIS tasks can be performed without being admin at all. Actually, I develop on a Vista machine and I very rarely need to elevate to admin status. Then why would you get errors if you are not an admin? Most often you don't have access to whatever data source or package location you use. E.g. if you load a package from SQL Server, you need to have access to this SQL Server. Often people automatically get access because they are local admins, but for other users, access should be configured explicitly.

Time dimension - What should the key be?
Simon Sabin
We are currently implementing a data warehouse, and the design of the time dimension was recently under way.

Distinct Counts in Analysis Services 2005
William Pearson
In a couple of earlier articles of this series, Introduction to MSSQL Server 2000 Analysis Services: Distinct Count Basics: Two Perspectives and Introduction to MSSQL Server 2000 Analysis Services: Manage Distinct Count with a Virtual Cube, I introduced the general concept of distinct counts, discussing why they are useful (and often required) within the design of any robust analysis effort. In these and other articles, I described some of the challenges that were inherent in their use in Analysis Services 2000, before undertaking practice exercises to illustrate solutions to meet example business requirements.

Creating Array or Table Parameters for a Stored Procedure
Jeff Smith
SQL is a set-based language, and often we wish to pass sets of data as a parameter to a stored procedure. For example, you might wish to pass a set of Customers to an invoice-generating stored procedure, or a set of employees for which you'd like to calculate a particular bonus. Sometimes we need even more complex data than simple lists; you might have a time sheet application that lets a user create or edit a weekly time sheet at the client, and then you need to accept and validate an entire day of data when submitted.

Using Dynamically-Generated Sets in the MDX Script
Chris Webb
Even more fun: I've just found that you can use the technique for dynamically generating sets within a cube's MDX Script. For example, add the following set declarations to the end of the Adventure Works cube's MDX Script:

Install a self-signed test certificate that can be loaded by SQL Server automatically
Xinwei Hong
In the blog Certificate for SQL Server 2005, I explained the requirements of a certificate for SQL Server 2005. Customers usually need to purchase a certificate from a certificate authority (such as VeriSign), or obtain a test certificate from one of the free providers available on the Internet. The process is usually complicated and time-consuming. If you just want a certificate for testing purposes, there is an easy way to get one. You can use the makecert tool to generate a self-signed certificate for your own use. The following command would do the work:

HOWTO: Pending Changes for a Merge Subscription (Server Side)
Laxmi Narsimha Rao ORUGANTI
Earlier, in another blog article, we discussed "HOWTO: Pending Changes for a Merge Subscription (Client Side)", and this article will focus on finding the pending changes on the server side for download. This blog article requires a little understanding of Remote Data Access (RDA). Note that RDA does NOT need any extra configuration and can use/piggyback on the current merge replication configuration, both on IIS and in SQL Server.

Merge Subscribe Once, Copy and Deploy Multiple Times
Laxmi Narsimha Rao ORUGANTI
"Merge Subscribe Once, Copy and Deploy Multiple Times" - This has been the real good feature I like and unfortunately not many people knew about it. This is applicable for v3.0 and above. Here is a short story of it:

Debugging Stored Procedures in Visual Studio 2005
Chris Rock
This article walks through the basics of debugging stored procedures using Visual Studio 2005. It covers breakpoints, watches and stepping through code. It was written by Chris Rock and includes his very unique sense of humor.

Event Handlers
Kirk Haselden
Way back, years now, I sat down with Euan Garden in his office for several days for literally hours and started talking about all the stuff we were going to do with DTS. Euan is a font of knowledge and those few days were like drinking from the proverbial fire hydrant. We talked about how we can make DTS better. We talked about all the problems and pain points people experienced with DTS. How much trouble it was to debug DTS. Loops and connections that transparently synchronize workflow. So many ideas flowed from those initial discussions, it's hard to qualify or quantify their importance in the design of what is now known as Integration Services. One of the problems we discussed was how to handle errors specifically and events generally inside the package. Well, Euan moved on to bigger and OK things. :) The DTS team continued thinking about this problem. We already had a workflow system, why not handle workflow events with workflow. This idea grew into what is now known as event handlers.

XML Jumpstart Workbench
Robyn Page and Phil Factor
Some of the frustration of learning XML is in not being able to see clearly an entire process. So many of the existing manuals concentrate on just one step in a chain and leave the reader saying 'very technological. So what?'

Optimize disk configuration in SQL Server
Denny Cherry
One of the easiest ways to improve the lifetime performance of a SQL Server database is proper setup of the physical and logical drives. While it's an easy technique, proper disk subsystem configuration is often overlooked or handled by a member of the IT staff other than a SQL Server DBA. All too often the disk subsystem, including the disk array, is configured based only on storage capacity, with no question of drive performance. Let's go beyond storage capacity requirements and design for drive performance.

Foreign Keys and their States
András Belokosztolszki
Foreign keys enforce referential integrity and in this way contribute to the database's consistency. However, sometimes it is necessary to disable them temporarily. I have seen some confusion as to how one re-enables these disabled constraints in such a way that the database's consistency remains unaffected. So, in this article, I will examine some rarely explored areas that concern foreign keys; in particular, I will look at disabled and un-trusted foreign keys.

Ten Mistakes That Send Development Projects Off Track
Peter Sterpe
It's no secret that far too many software development projects end in abject failure. Whether it's a simple internal application or a massive, well-documented boondoggle like the FAA's disastrous Air Traffic Control system update, there are a lot of reasons that good software concepts can go bad.

The Microsoft SQL Server support policy for Microsoft Clustering
Microsoft
This article describes the Microsoft support policy for SQL Server failover clustering. Microsoft Product Support Services (PSS) supports SQL Server failover clustering that is based on the failover clustering features of the Microsoft Cluster Service in the following products:

Rico Mariani's DLinq Performance Tips (Part 2) and Compiled Queries
Roger Jennings
Rico Mariani continues his analysis of LINQ to SQL performance issues with a new DLinq (Linq to SQL) Performance (Part 2) post dated June 25, 2007 by recommending compiled LINQ query expressions.

Blobs block online indexing
Simon Sabin
If you are lucky enough to be running the Enterprise Edition of SQL Server, you may have looked into, or want to look into, online index rebuilds. Unfortunately, having blobs in your index blocks you from being able to do an online rebuild of that index. Crucially, that means that if your table is clustered then you cannot rebuild the clustered index on that table online.

Slicing Books Online
Greg Buck Woody
With all of the content in Books Online arranged into a single table-of-contents, it's sometimes difficult to find information that "spans" topics. For instance, if you want to find out about general security topics, you'll find they are spread across many sections of Books Online. Although we try to make this layout as clear as possible, sometimes information fragmentation is unavoidable. Here's how I deal with it, at least on the DExplorer connected client. (I use a different strategy on the web, one that I'll blog about later).

The Bounding Box, corrected version
Hugo Kornelis
In a previous posting, I explained a minor problem in the "Bounding Box" and "Dynamic Bounding Box" algorithms I describe in chapter 9 of Adam Machanic's book Expert SQL Server 2005 Development. In short, the problem is that I calculated the boundaries of the search area just a tiny bit too small, introducing a (very small) chance of relevant data not being found. I also presented a quick and dirty fix that basically works by deliberately making the search area too big, sacrificing (a small bit of) performance for correctness. And I promised to keep trying to find a better solution.

Table Valued Parameters in SQL Server 2008
Glenn Berry
One great new feature in SQL Server 2008 is Table Valued Parameters (TVP). They allow you to easily solve a common and difficult scenario where you want to insert or update multiple rows in a table without making multiple database calls or using somewhat nasty methods like large IN clauses.

SQL SERVER - Running Batch File Using T-SQL - xp_cmdshell bat file
Pinal Dave
Last month I received a few emails regarding SQL SERVER - Enable xp_cmdshell using sp_configure. The questions were: 1) What is xp_cmdshell used for? and 2) How do you execute a BAT file using T-SQL? I really like the follow-up questions to my posts/articles. The answer is that xp_cmdshell can execute shell/system commands, which includes batch files.
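A short sketch of the answer described above; the batch file path is hypothetical.

-- Enable xp_cmdshell (off by default in SQL Server 2005), then run a BAT file.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'xp_cmdshell', 1;
RECONFIGURE;

EXEC master..xp_cmdshell 'C:\Scripts\nightly_cleanup.bat';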

Sorting data with row size greater than 8094 works in SQL Server 2005
Mladen Prajdic
When selecting data from a table we can select rows which contain more than 8094 bytes. The problem arises when trying to sort those rows.

Standard Reports: Basics for Business Users
Warren Thornthwaite & Joy Mundy
How to plan, prioritize and design the primary vehicle for delivering business intelligence. Here's how to plan, prioritize and design standard BI application reports. Business people should be eager to dive in and explore the data that represents their business. After all, who knows better what information is needed? Unfortunately, few business people seem to agree. Consider yourself lucky if 10 percent of your users actually build their own reports from scratch.

Building and Delivering BI Reports
Warren Thornthwaite
In our last column, we laid out a process for identifying, prioritizing and specifying the core set of business intelligence (BI) applications (see "Standard Reports: Basics for Business Users"). In this column, we describe the BI application development tasks that begin much later in the lifecycle, once real data is available in its final dimensional structure and the BI tool has been selected. As always, our process starts with a bit of planning and preparation. We then create the initial set of standard reports, based on the prioritized list from the design phase. The next step is testing and deployment, which leads to a discussion about ongoing maintenance and extending the standard reporting system.

Kimball University: Data Stewardship 101: First Step to Quality and Consistency
Bob Becker
Data stewards are the liaisons between business users and the data warehouse team, and they ensure consistent, accurate, well-documented and timely insight on resources and requirements.

Kimball University: Integration for Real People
Ralph Kimball
These step-by-step guidelines will help dimension managers and users drill across disparate databases.

Kimball University: Overcoming Obstacles When Gathering Business Requirements
Margy Ross
How do you cope with "abused users, overbooked users, comatose users, clueless users" and "know-it-all users" during the requirements-gathering stage of a data warehouse/BI project? Kimball group offers its advice for proactively working with (or around) the uncooperative, unavailable, uninsightful and irrepressible types who sometimes make it hard to know just what the business needs.

Kimball University: Four Fixes Refurbish Legacy Data Warehouses
Margy Ross
DW/BI professionals are often tasked with making evolutionary upgrades and improvements to minimize cost and upheaval in the current analytic environment. We explore four upgrades that can breathe new life into legacy data warehouses.

Kimball University: Building a Foundation for Smart Applications
Joy Mundy
Off-the-shelf apps may offer built-in analytics, but the best approach to supporting operational decisions is to rely on a solid data warehouse that cleans and integrates the data.

Alan Alda's Interviewing Tips for Uncovering Business Requirements
Margy Ross and Ralph Kimball
Good listening and conversational skills will uncover hidden needs and 'shadow functions.'

Slowly Changing Dimensions Are Not Always as Easy as 1, 2, 3
Margy Ross and Ralph Kimball
The three fundamental techniques for changing dimension attributes are just the beginning. How do you deal with changing dimensions? Hybrid approaches fill gaps left by the three fundamental techniques.

Don't Forget the Owners Manual
Joy Mundy
Be sure to develop a DW/BI operations plan before deployment. Ensure trouble-free data warehouse and BI deployments by answering critical planning questions.

Beware the Objection Removers
Ralph Kimball
Is that sales pitch flying in the face of conventional wisdom? Start asking questions now.

Get Started With Data Mining Now
Warren Thornthwaite
Are you missing potential business opportunities because you're not exploring your data?

The Matrix: Revisited
Margy Ross
This Swiss Army knife for BI and data warehousing supports planning, integration and stewardship.

Differences of Opinion
Margy Ross & Ralph Kimball
Comparing the dominant approaches to enterprise data warehousing. The Kimball bus architecture and the Corporate Information Factory: What are the fundamental differences?

Dashboard Confusion
Stephen Few
Delve beneath the marketing hype if you want to gain a clear understanding of BI dashboards.

Fables and Facts
Margy Ross & Ralph Kimball
Do you know the difference between dimensional modeling truth and fiction?

RFID Tags and Smart Dust
Ralph Kimball
RFID tagging will create not just a tidal wave of data, but lifetime employment for data warehouse designers.

Real Time: Get Real
Neil Raden
Take the idea of a real-time data warehouse with a grain of salt, then realize the possibilities.

Real Time: Get Real, Part II
Neil Raden
Start by discarding your current concepts of ETL. Making the assumption that the primary database for the real-time data warehouse (RTDW) is a relational database, updating one in real time doesn't, on the face of it, seem like a big problem. After all, relational databases have been tuned for OLTP for years, allowing tens of thousands of updates per second. How could this be a problem for a data warehouse?

Better Storytelling
Jim Stagnitto
Dimensional design techniques bind events into stories. Storytelling is a powerful means of sharing knowledge because it simultaneously and elegantly satisfies several innate human needs, including the need for entertainment, sharing the accumulated wisdom of experience, and providing avenues of self-expression. How, you may ask, does storytelling relate to the job of a data warehouse designer? Design techniques can amplify the dimensional data warehouse's ability to tell the stories encased within the millions or billions of events sitting within its subject areas.

The Bottom-Up Misnomer
Margy Ross
Our data-warehousing approach is sometimes referred to as bottom-up, but it's far from it. The Kimball methodology of building a data warehouse is often called a bottom-up approach. This label, along with its associated connotations, is misleading, and misunderstandings about our approach are proliferating. It's time to set the record straight: Although our iterative development and deployment techniques may superficially suggest a bottom-up methodology, a closer look reveals a broader enterprise perspective.

Fistful of Flaws
Margy Ross & Ralph Kimball
Use this checklist to review your dimensional models. People often engage us to conduct dimensional model design reviews. In this column, derived from The Data Warehouse Toolkit, 2nd Edition (Wiley, 2002), we'll provide a laundry list of common design flaws to scout for when performing a review. We encourage you to use this list to critically review your own draft schemas in search of potential improvements.

No Detail Too Small
Margy Ross & Ralph Kimball
Although there's no substitute for atomic details, look into complementary consolidations. Atomic fact tables are the core foundation of any analytic environment. Business analysts thrive on atomic details because they can be easily rolled up "any which way" by grouping on one or more dimension attributes. The robust dimensionality of atomic data is extremely powerful as it supports a nearly endless combination of inquiries. However, business analysts can't always live happily ever after on atomic details alone.

Fact Tables and Dimension Tables
Ralph Kimball
The logical foundation of dimensional modeling. Dimensional modeling is a design discipline that straddles the formal relational model and the engineering realities of text and number data. Compared to entity/relation modeling, it's less rigorous (allowing the designer more discretion in organizing the tables) but more practical because it accommodates database complexity and improves performance. Contrasted with other modeling disciplines, dimensional modeling has developed an extensive portfolio of techniques for handling real-world situations.

Declaring the Grain
Ralph Kimball
It's the most important dimensional design step after identifying data sources. In debugging literally thousands of dimensional designs from my students over the years, I have found that the most frequent design error by far is not declaring the grain of the fact table at the beginning of the design process. If the grain isn't clearly defined, the whole design rests on quicksand. Discussions about candidate dimensions go around in circles, and rogue facts that introduce application errors sneak into the design.

The Soul of the Data Warehouse, Part One: Drilling Down
Ralph Kimball
Drilling down just means "show me more detail". Although data warehouses come in many shapes and sizes and deal with many different subject areas, every data warehouse must embody a few fundamental themes. The three most important are drilling down, drilling across, and handling time. Modern data warehouses so deeply embed these three themes that I think an "if-and-only-if" relationship has developed between them and a real data warehouse. If a system supports drilling down, drilling across, and handling time, then as long as it's easy to use and runs fast, it automatically qualifies as a data warehouse. But as simple as these three themes might seem, they give rise to a set of detailed and powerful architectural guidelines that should not be compromised.

The Soul of the Data Warehouse, Part Two: Drilling Across
Ralph Kimball
Drilling across means asking for the same row headers from another fact table. The three fundamental themes that make up the soul of every data warehouse are drilling down, drilling across, and handling time. In Part One of "The Soul of the Data Warehouse," I showed that drilling down was nothing more than adding a row header, any row header, to an existing query. Although we often grouse about SQL's limitations as a report writer, when it comes to drilling down, SQL gracefully expands an existing query whenever a new row header is added. This simple result led to the powerful realization that when data is organized in a symmetrical, predictable fashion starting at the most atomic level, all queries and applications benefit.

The Soul of the Data Warehouse, Part 3: Handling Time
Ralph Kimball
The data warehouse takes a pledge to preserve history. The three most fundamental maneuvers in every data warehouse are drilling down, drilling across, and handling time. I discussed the first two maneuvers in the previous two installments of this Fundamentals series. (See Resources.) The third, handling time, makes good on a pledge that every data warehouse provider implicitly takes: The data warehouse shall preserve history. In practice, this pledge generates three main requirements for the data warehouse:

Lost, Shrunken, And Collapsed
Lawrence Corr
Why and how to create the three types of aggregates in a dimensional data warehouse. We do best to think of aggregate tables as summary indexes, as I explained in my previous guest column ("Aggregate Improvements," Oct. 4, 2001). By doing so, we recognize that pre-aggregation, like other indexing techniques, is essential to data warehouse performance. Also, aggregates should be silently utilized just as other indexes are, via a form of query optimization known as aggregate navigation. My previous column also discussed how to select the right size and number of these all-important summary indexes.

The Anti-Architect
Ralph Kimball
How not to design and roll out a data warehouse. Data warehousing is interesting because it involves so many different kinds of businesses and because the responsibility is so central to the mission of IT. But, as important as this job is, I have often felt overwhelmed when I listen to someone explain all the data warehouse manager's responsibilities. Be responsive to the business. Be responsive to the end users. Use technology wisely. Don't forget anything. Deliver results on time. Be courteous, kind, thrifty, brave, clean, and reverent (like a Boy Scout).

Realtime Partitions
Ralph Kimball
Build a special extension of each fact table to complement your static data warehouse. In the past two years, a major new requirement has been added to the data warehouse designer's mental list. The data warehouse now must seamlessly extend its existing historical time series right up to the current instant. If the customer has placed an order in the last hour, you need to see this order in the context of the entire customer relationship. Furthermore, you need to track the hourly status of this most current order as it changes during the day.

Doing the Work at Extract Time
Ralph Kimball
Countering the minimalist approach. Your mission as data warehouse designers is to publish your data most effectively. Fulfilling this mission means placing the data in the format and framework that is easiest for end users and application developers to use. Perhaps an apt analogy is a well-equipped kitchen. Ideally all the ingredients and tools needed to cook a meal are within easy reach. The ingredients are exactly what the recipe calls for, and the tools are meant for the job. Great cooks are notorious for using elaborate procedures out of view in the kitchen to produce just the right effect when the plate is served; so by analogy the back-room "chefs" in the data warehouse should do much the same.

Real-Life Data Mart Processing
Gabriel Tanase
Is your mart the symmetric "information diamond" at the end of the data pipeline? The conventional vision of a data mart is of a cleansed, historical, queryable data source for reports and ad hoc queries. And, of course, these reports are used directly for business decisions. Rarely are there any further feeds to downstream applications. Our vision also assures us that ad hoc analysis is possible by using any combination of available dimensions joined to one or a few fact tables. We often gravitate to a fundamentally symmetric star schema logical design where all the atomic measures can be meaningfully aggregated under any combination of dimensions. Any way you look at this information diamond, it looks clear and shiny. If only real life were as simple and symmetric as a diamond....

Asymmetric Aggregation
Gabriel Tanase
What to do if your data mart isn't a shiny dimensional diamond. In last issue's column I tried, without sounding too pessimistic, to describe several real-world threads most data mart designers are likely to encounter that make the data mart more complex than the shiny dimensional "diamond" we all dream about. (See "Real-Life Data Mart Processing.") These threads include: three distinct usage styles (ad hoc querying, standard reporting, and using the data mart as an intermediate step in the delivery of data to a serious downstream application); the existence of complex nonlinear aggregated measures; and the conflict between source-aggregated and data mart-aggregated data that should produce the same results but don't.

Tricky Time Spans
Ralph Kimball
The time dimension isn't nearly as simple as it looks. In some ways the time dimension seems like the simplest dimension. At least, if you're modeling calendar days, the time dimension is small and predictable. You can make up a 10-year time dimension in an afternoon using a spreadsheet; that's only about 7,300 rows.

An Engineer's View
Ralph Kimball
It's worthwhile to remind ourselves why we build data warehouses the way we do. Based on reader feedback, I've decided to do something I haven't done for quite some time: go back to lay the foundation for what data warehouse designers do, and why they use certain techniques. Doing this also lets me restate the assumptions and techniques with the benefit of hindsight and more experience. I hope the results are tighter, more up to date, and more clearly worded.

What Changed?
Joe Caserta
Use a multivalued outrigger table to add expressive power to your dimensions. The dimensional data warehouse is a specially architected database that primarily consists of two types of tables: dimensions and facts. But crafting these tables is just the beginning of designing a complete data warehouse solution. Naturally, only the simplest warehouse solutions could be accomplished using just these primitive table types. Business requirements consistently force data warehouse designers to embellish on these basic table types to develop creative and effective business solutions.

Design Constraints and Unavoidable Realities
Ralph Kimball
No design problem in school was this hard. The basic mission of a data warehouse professional, as I described in my previous column ("An Engineer's View," July 26, 2002), is to publish the right data. Because I am an engineer and try to build practical things that work, I then asked: What does a good engineer do when confronted with the task of presenting an organization's data most effectively?

Two Powerful Ideas
Ralph Kimball
The foundations for modern data warehousing. There are two powerful ideas at the foundation of most successful data warehouses. First, separate your systems. Second, build stars and cubes. In my previous column, I described a complete spectrum of design constraints and unavoidable realities facing the data warehouse designer. This was such a daunting list that I worried that you would head for the door. But maybe what kept you reading this far was my promise to dig us out of the morass. This is where the two powerful ideas come in. Last time, I stated that the nonnegotiable constraints on a data warehouse design were end-user understandability and query execution speed. A complex, slow data warehouse is a failure no matter how elegant the rest of the design may be because the people for whom it was intended won't want to use it. All the rest of the constraints and unavoidable realities were pragmatic and honest admissions that the data warehouse design space is extremely complex. The source data, the source systems, the database technologies, and the business environments we must deal with are incredibly complicated. So, as good engineers, to dig our way out of the mire, we must decompose the problem into separate manageable parts and emphasize techniques that are predictable, reusable, and robust when our design environment is altered.

Relating to OLAP
Joy Mundy
OLAP and ROLAP are a continuum, not competitors. Virtually all data warehouses use a relational data store. As a relational designer until a few years ago, I assumed that online analytic processing (OLAP) was merely a technology for small-scale applications. I now believe that perception is outdated and will become only more so as OLAP servers evolve to be a major component of the data warehouse.

Divide and Conquer
Ralph Kimball
Build your data warehouse one piece at a time. Deciding whether to enforce common labels for disparate data sources across the enterprise is the $64,000 question in many data warehouses. This decision is one you should address at the very earliest stages of implementation, preferably before you have exposed any data sources to end users. You may remember that in my previous column ("Two Powerful Ideas," Sept. 17, 2002) I described two ideas that are the basis for data warehouse design: The first was about separating your systems logically, physically, and administratively into a backroom data staging area and a front-room data presentation area. The second was about building stars and cubes in the presentation area. Taking the engineer's perspective, I described many advantages gained from these ideas. At the end of the column, I left you with the thought that the symmetrical stars and cubes I designed in the presentation area gave us a set of predictable points of commonality for linking together data from across the enterprise.

Designing the Financial Data Warehouse
Dan Hughes
Now that the finance function has knocked on your door, what's next? Building a data warehouse to support financial decision-making can be one of the most challenging endeavors a data warehousing manager can undertake. Senior management visibility, political territory, domain expertise, complex transaction systems, and a need to understand the business at a cross-functional level are just a few of the obstacles. But working in this area can also be a boon to your career and very rewarding. In this column, I will describe the typical transaction system landscape supporting the finance function and review some limitations of a typical financial analysis model. I will then present an approach that can overcome these limitations.

The Promise of Decision Support
Bill Schmarzo
Analytic applications return to the original reason for the data warehouse. Many data warehouse implementations start by servicing the reporting needs of their business communities. They focus on providing a rearview mirror perspective on their business operations, but stop there and declare success. These projects fail to go beyond merely providing a bunch of prebuilt reports. Instead, they need to go further, to entwine analytic applications into the very fiber of an organization's decision-making processes.

Monitoring Changes to your Database Schema
Gregory A. Larsen
How can you monitor changes to your database schema? Why would you want to track changes to your database structure? You might want to track changes to your database data definitions for a number of different reasons. With the introduction of SQL Server 2005, Microsoft now allows you to write Data Definition Language (DDL) triggers to perform actions when events occur on your server. In this article, I will discuss some of the things that DDL triggers can be used for, as well as show you a few DDL trigger examples.
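
A minimal sketch of the idea (the audit table and trigger names are hypothetical): a database-level DDL trigger that logs the XML returned by EVENTDATA() whenever selected schema changes occur.

    CREATE TABLE dbo.DDLAudit
    (
        EventTime DATETIME NOT NULL DEFAULT GETDATE(),
        LoginName SYSNAME  NOT NULL,
        EventInfo XML      NOT NULL
    );
    GO

    CREATE TRIGGER trg_AuditSchemaChanges
    ON DATABASE
    FOR CREATE_TABLE, ALTER_TABLE, DROP_TABLE,
        CREATE_PROCEDURE, ALTER_PROCEDURE, DROP_PROCEDURE
    AS
    BEGIN
        -- EVENTDATA() describes the DDL statement that fired the trigger.
        INSERT INTO dbo.DDLAudit (LoginName, EventInfo)
        VALUES (ORIGINAL_LOGIN(), EVENTDATA());
    END;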

Skipping SQL Server Versions
Brian Moran
I wanted to follow up on a few topics related to last week's editorial in which I wondered whether customers might leapfrog from SQL Server 2000 straight to SQL Server 2008 (code-named Katmai). You can read last week's editorial "Leapfrogging to Katmai" and peruse the reader responses that have come in so far at http://www.sqlmag.com/Articles/ArticleID/96348/96348.html.

What is a page split? What happens? Why does it happen? Why worry?
Tony Rogerson
You've probably heard the term bandied around, but do you know what it means and what it means for the performance of your application? I'm going to demonstrate a page split, use DBCC PAGE to show what happens, use SQL Profiler to show the locks, and talk through what's going on.
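
For readers who want to follow along, the inspection commands look roughly like this (DBCC IND and DBCC PAGE are undocumented, and the database, table and page number below are hypothetical):

    -- Send DBCC PAGE output to the client instead of the error log.
    DBCC TRACEON(3604);

    -- List the pages belonging to the table so we can pick one to inspect.
    DBCC IND ('MyDatabase', 'dbo.MyTable', 1);

    -- Dump one page: database, file id, page id, print option 3 (row-level detail).
    DBCC PAGE ('MyDatabase', 1, 153, 3);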

Stay Away From Auto Increment Columns
Rafik Robeal
I had a few conversations on the sync services forum about the ability to synchronize a table that uses an auto-increment column, also known as an identity column, as the primary key or unique id for the row. It is rather common to see this pattern of a primary key set to an auto-increment integer type in every table, with foreign key relationships then established across different tables in the database. Unfortunately, auto-increment columns are not designed with sync in mind.

Understanding SQLIOSIM Output
Kevin
One of the things that's great about my job is the fact that I get to meet people who are a lot smarter than me. One such person I count among my friends is Jimmy May, a Microsoftee from the Indianapolis area. Jimmy has been using SQLIOSIM for a while, but has discovered that its reporting leaves a bit to be desired. Jimmy and I were discussing the reporting characteristics of SQLIOSIM and he thought it'd be useful to share his findings with the public. Thanks for sharing, Jimmy!

Summer SQL Teaser #5 Update
Denis Gobo
Try to guess the output if you were to run this. I tested this on SQL Server 2000, 2005 and 2008 (June CTP) and I get the same output on all three servers.

To DBA or not to DBA?
jidaw.com
Who is a DBA? Many people who are interested in IT careers often talk about becoming a DBA. This is no surprise as the demand for DBAs continues to rise. But while interest is quite high do many really know what being a DBA entails? To some it's just about getting a DBA certification. For most, it is simply a bad case of "joining the bandwagon".

What Is a DBA? - Part 4
Craig S. Mullins
There are DBAs who focus on logical design and DBAs who focus on physical design; DBAs who specialize in building systems and DBAs who specialize in maintaining and tuning systems; specialty DBAs and general-purpose DBAs. Truly, the job of DBA encompasses many roles. Some organizations choose to split DBA responsibilities into separate jobs. Of course, this occurs most frequently in larger organizations, because smaller organizations often cannot afford the luxury of having multiple, specialty DBAs. Still other companies simply hire DBAs to perform all of the tasks required to design, create, document, tune, and maintain the organization's data, databases, and database management systems. Let's look at some of the more common types of DBA.

MDX in Dashboards, Scorecards, and Views?
Mswarner
You can use MDX in Dashboard Designer in three areas: filters, KPI data mappings, and the Analytic View Designer. This capability allows you to do advanced selections and queries against OLAP data sources that may not be possible through the designer user interface.

Pull Merge Replication Security with Sql Server 2005 with no Domain Set Up
Jonas Stawski
I have been testing merge replication in SQL Server 2005 for a client. The truth is that I was very impressed with it. This thing works very well (at least with the requirements we had) and it's all a matter of configuration. I had some difficulty setting up the subscribers running on each client (pull replication). The agents would constantly fail due to security reasons. After a lot of trial and error, and finally reading about it here, I was able to have a secured model and make replication work. I must say I didn't have a domain set up on either the client or the server, which made this much more difficult. Here are some of the errors I was getting:

Save documents with SQL Server and display them
Jonas Stawski
So you have the need to let users upload documents of any type to your server, but you don't know how to implement it. Some people say store them in the DB; others hate that and store the files on the file system. Which way do you go? Well, that depends on the requirements and the way the application is designed. I really try to stay away from storing binary content in the DB, but sometimes it is the best approach. In my case I didn't want a user accessing the file in any other way than through the user interface of the application. And since the web application was using Windows Authentication and impersonation, if I had stored the files on the file system I would have had to grant the users accessing the app access to those files, which in turn would have let them reach the files directly through the network. So the DB was the answer to my problems.

SSAS: Exporting MDX and XMLA from SQL Server Profiler
Darren Gosbell
Hey, why didn't anyone tell me about this one? :) I was not looking for this; I was looking for some way to easily extract event durations without having to get my hands dirty with angle brackets (XML), and I stumbled upon the menu below.

Script Task Breakpoints in Katmai
Konesans Ltd
One of the main things coming from our look at SQL Server 2008 Integration Services is that you are not going to see a lot of change visually. What you are going to see is things going on underneath the covers that make Integration Services perform better, along with better informational messages from the components. Just such a change is seen in the Script task. In SQL Server 2005 we can drop two Script tasks onto the designer and join them with workflow. We can then add a breakpoint to each of the Script tasks and the control flow will look like this:

SSL Cipher Suites used with SQL Server
Il-Sung Lee
When enabling channel encryption between the application and SQL Server, users may wonder what encryption algorithm is being used to protect their data. Unfortunately, this isn't an easy question to answer and here's why.

Login failed for user , Msg 18456, Level 14, State 1
MING LU
This post focuses on a typical cause of the SQL login failure described in the title. In the real world, a customer logs on to the machine as a domain\user or a local machine account and can connect to SQL Server through SQL Server Management Studio. However, he fails to connect by using a SQL login, namely when the connection string specifies the NT account as the User, plus the password.

Beware SQL Injection
Glenn Berry
Microsoft's UK site was defaced on June 26 by hackers apparently from Saudi Arabia, using a SQL Injection attack. This comes only a few weeks after another Microsoft web site defacement by SQL Injection attack back in early May.

Has Microsoft the chances to be the leader in the Business Application world?
STEFANO DEMILIANI
To my surprise, Roberto Andreoli today published on his blog a post about one of the topics that I'll try to demonstrate over the next weeks to those who are fortunate :P enough to be present at the next events organized by my company (I'll talk about these events later).

Enabling Database Mail on SQL Server Express
Mladen Prajdic
Database Mail is a completely rewritten mailing system in SQL Server 2005, built on top of Service Broker. This means that it runs asynchronously: the mails are put in a queue and are sent from there. However, it's not present in SQL Server Express, and I wonder why not, because Express supports Service Broker just fine. Well, I've found a great blog post that explains how to "enable" Database Mail in SQL Server Express. But it's in German, so I'll explain in English how to get it to work :) SQL Express holds all the necessary stored procedures, Service Broker queues, etc. for proper mail handling. However, when you try to send an email, the processing of queued mails fails because an external process called "DatabaseMail90.exe" couldn't be started. This means that the file is simply missing.

Did You Know? A 6th parameter is available when creating a trace
Kalen Delaney
Most people I know, as geeky as some of them are, prefer to create their server-side trace definitions by letting the Profiler GUI do the dirty work. It's so much nicer to just point and click your way to defining a set of trace events, data columns and filters, than having to write dozens or even hundreds of calls to sp_trace_setevent. ( The sp_trace_setevent procedure has to be called once for every combination of event and data column, so for even the smallest trace, with half a dozen events and half a dozen data columns, that would be 36 calls.)

How enabling DMF affects speed of table creation
Alexey Yeltsov
Perf hit is expected, but looks like currently it's quite a lot. Say I can create ~500 tables and views (250+250) per second without DMF. Once I enforce naming conventions (tbl% for tables and vw% for views), the rate of object creation falls to 3 objects per second. Seems like a pretty high tax. Also, database that does not subscribe to the policy is experiencing a tangible hit as well (500 -> 200). Is this expected? Has this been noticed and any conclusions drawn yet? Will there be improvements? Please let me know if I should file a connect bug. Results (policy is enabled for Database1 and not enabled for Database2):

Is it possible to restore all databases from Enterprise Edition to Standard Edition in SQL Server?
Satya SK Jayanty
Yes and no! Yes, you can restore user databases (only) from Standard Edition to Enterprise Edition, and vice versa, for further work. No, you cannot restore any system database, including master, from Standard Edition to Enterprise Edition or vice versa. Take help from KBA224071 for moving system databases. Additionally, within this exercise of moving databases don't forget to take help from KBA918922 to tackle login issues, including passwords.

Restrict Access to SQL Server using "Certificates"
Satya SK Jayanty
Within SQL Server 2005 you can take help of certificates to restrict access from a particular client machine. A few times even I have confused this with SSL encryption or had the impression that it is the same thing; for more information on SSL (Secure Sockets Layer) refer to KBA316898. There is much more information available within SQL Server 2005 BOL on this topic compared to previous versions of SQL Server.

Spatial Enabling Your Data Warehouse
Ralph Kimball
I have always been puzzled by the chasm separating the data warehouse community and the geographic information systems (GIS) community. Very few "conventional" data warehouses exploit their data with a map-driven approach, yet these same data warehouses are rich with geographic entities including addresses, point locations, sales districts, and higher level political geographies.

Smarter Data Warehouses
Joy Mundy
Few queries and reports perform any calculation fancier than summing or counting, with the occasional ratio thrown in for excitement. Query and reporting tool vendors have done an excellent job of making the simple things easy. They do a nice job of providing functionality that is missing from SQL, such as subtotals and market contributions. Some can perform more complex calculations, but doing so requires extracting data from the database onto a client desktop or middle-tier server.

Joint Effort
Ralph Kimball
Administer the distributed data warehouse correctly or lose it! As you may have noticed, my column's name has changed to "Data Warehouse Designer." This new name is more descriptive of the broader scope and more practical flavor of the subjects I am now covering here. I will continue to invite a number of experienced data warehouse practitioners to alternate with me in writing these columns. I hope you will enjoy some of their fresh perspectives. While we are on the subject of new things, this column's topic is about the most important new development in data warehousing: the distributed data warehouse (DDW). Inspired by the Web and the ability to share data through XML, a DDW arises whenever several organizations agree to share data.

The Unity Dimension
Nick Galemmo
A saner solution for when differing entities share a role. I recently designed an enterprise data warehouse for a large food manufacturer. As I began my analysis and executive interviews, I realized that my preconceptions of the necessary dimensions for the model were off target. Having worked exclusively for other types of manufacturers in the past, I was unaware of the significant role logistics plays in the order fulfillment cycle for food makers.

A Trio of Interesting Snowflakes
Ralph Kimball
Beat three common modeling challenges with extensions of the dimensional model. "When can I use a snowflake?" is a question data warehouse designers have asked me hundreds of times. I usually answer that it's a bad idea to expose the end users to a physical snowflake design, because it almost always compromises understandability and performance. But in certain situations a snowflake design is not only acceptable, but recommended.

Managing Helper Tables
Ralph Kimball
A careful look at many-to-many relationships between important dimensions. Multivalued dimensions are normally illegal in a dimensional design. We usually insist that when the grain of a fact table is declared, the only legal dimensions that can be attached to that fact table are those that take on a single value for that grain. For instance, in the banking world, if the grain of the fact table is Account by Month, then we exclude the Transaction dimension because it takes on many different values throughout the month. If we wanted to see individual transactions, then we would declare a finer grain, such as Account by Transaction by Time of Day.

Managing Your Parents
Ralph Kimball
The parent-child data relationship is one of the fundamental structures in the business world. An invoice (the parent), for instance, comprises many line items (the children). Other examples include orders, bills of lading, insurance policies, and retail sales tickets. Basically, any business document with an embedded repeating group qualifies as a parent-child application, especially when the embedded line items contain interesting numerical measurements such as dollars or physical units.

Catastrophic Failure
Ralph Kimball
Know the threats to your mission-critical data warehouse and how to defend against them. The tragic events of September 11 have made all of us reexamine our assumptions and priorities. We are forced to question our safety and security in ways that would have seemed unthinkable just weeks before.

Aggregate Improvements
Lawrence Corr
Are you taking full advantage of the ability to presummarize your data? "The single most dramatic way to affect performance in a large data warehouse is to provide a proper set of aggregate (summary) records ... in some cases speeding queries by a factor of 100 or even 1,000. No other means exist to harvest such spectacular gains."

What Not to Do
Ralph Kimball
Dimensional modeling mistakes to guard against. In nearly all the 85 columns I have written for Intelligent Enterprise and its predecessor, DBMS, I described design techniques and criteria needed to build some part of a data warehouse. But despite all those columns, something is missing. The tone of the columns has almost always been imperative: "In situation A, use design techniques X, Y, and Z." I realize that data warehouse designers also need boundaries. So this column is devoted to design techniques not to use. I have restricted myself to dimensional modeling design techniques; a future column will take a broader view of the entire data warehouse life cycle.

Clickstream Data Mart
Joe Caserta
It's part of the 360-degree customer view. It's being requested more. Are you ready? The clickstream's image is finally transforming, deservedly so, from that of a glorified "hit counter" to that of a crucial customer information source. Case in point: Over the past few months, many people from even non-dot-com corporate environments have asked me about clickstream data marts. Most of this change of perception probably owes to the endless effort businesses are exerting to represent all 360 degrees of the customer in their data warehouses. After all, the clickstream data mart is a strategy that allows companies to truly complete the customer profile, by providing a way to integrate all Web-based customer activity data with conventional transactions.

The Special Dimensions Of the Clickstream
Ralph Kimball
The most exciting new data source in the data webhouse is the clickstream: the river of clicks on our Web sites. The clickstream contains a record for every page request from every visitor to our site. In many ways, we can imagine that the clickstream is a record of every gesture each visitor makes, and we are beginning to realize that these gestures add up to descriptions of behavior we have never been able to see before.

Mystery Dimensions
Ralph Kimball
Often, we data warehouse architects drive a fact table's design from a specific data source. A typical complex example might be a set of records describing investment transactions. A recent example I studied had more than 50 fields in the raw data, and the end users assured me that all the data was relevant and valuable.

Indicators of Quality
Ralph Kimball
As data warehouse managers, we always have an eye on data quality. Because our users trust us implicitly, we worry whether our data is accurate and complete. We occasionally have nightmares that the auditors will ask us how the data got into the database, and what exactly our assumptions were when we computed a certain number.

There Are No Guarantees
Ralph Kimball
As database designers, we talk frequently about business rules. Business rules, after all, are the heart and soul of our applications. If our systems obey the business rules, then the data will be correct, our applications will function, and our users and management will be happy.

Backward in Time
Ralph Kimball
Stitching late-arriving records into an existing data warehouse. As a data warehouse practitioner, you have been developing powerful techniques for accurately capturing the historical flow of data from your enterprises. Your numeric measurements go into fact tables, and you surround these fact tables with contemporary descriptions of what you know is true at the time of the measurements. You package these contemporary descriptions as dimension tables in your dimensional schemas. The dimension tables representing the customer, the product, and the store must all contain the correct descriptions of these entities at the time of the purchase. That is why you call the descriptions "contemporary."

The Keyword Dimension
Ralph Kimball
Using keywords in a data warehouse of historical letters. I recently had the opportunity to design a data warehouse for a massive archive of historical letters. The archive was fascinating, consisting of hundreds of thousands of letters from the 1500s through World War II. The archive mostly consisted of well-preserved envelopes, showing the origins and destinations of this "postal history." But perhaps 15 percent of the letters actually retained their original contents, ranging from simple business correspondence, to family letters, to mail from soldiers on the front lines.

Is Your Data Correct?
Ralph Kimball
Simple statistical techniques can help you ensure that users have accurate information at their fingertips. One common problem in the data warehouse back room is verifying that the data is correct before you release it to your users. Is the warehouse an accurate image of the production system? Was this morning's download complete? Could some of the numbers be corrupted?

Fact Tables for Text Document Searching
Ralph Kimball
Using a similarity metric as a measured fact. In the October 20, 2000 issue, I dipped my toe into the large lake of text string searching. I described two approaches for handling a list of keywords describing an archive of documents. But a keyword-based approach to accessing a large number of documents makes some strong assumptions. A really good set of keywords may require human reviewers, a fact that certainly restricts the scale of the document archive. There aren't enough human reviewers to index millions of documents. Some interesting progress has been made in automatically generating keywords for documents, much like you can automatically generate book indexes. But even if you have a pretty good set of keywords describing documents, your end users are still left with some tricky query issues.

Four Ways to Build a Data Warehouse
Wayne Eckerson
It has been said there are as many ways to build data warehouses as there are companies to build them. Each data warehouse is unique because it must adapt to the needs of business users in different functional areas, whose companies face different business conditions and competitive pressures. Nonetheless, four major approaches to building a data warehousing environment exist. These architectures are generally referred to as 1) top-down, 2) bottom-up, 3) hybrid, and 4) federated. Most organizations, wittingly or not, follow one or another of these approaches as a blueprint for development. Although we have been building data warehouses since the early 1990s, there is still a great deal of confusion about the similarities and differences among these architectures. This is especially true of the "top-down" and "bottom-up" approaches, which have existed the longest and occupy the polar ends of the development spectrum.

One Version of the Truth
Marianne Arents
Meta Data Management and Other Considerations for Enterprisewide Business Intelligence Implementations. This article identifies the considerations in managing the multiple data mart structures that span across the enterprise. In today's business environment, analysts throughout the enterprise are demanding more data availability, and are employing state-of-the-art BI tools to access that data. This requires the creation of numerous data marts, tailored to each functional area's specific perspective and data set. Analysts are also becoming more involved in the management of these data marts, since most of today's business intelligence (BI) tools offer a GUI interface to manage the mart, and the analysts know how the information should be structured. They are also hesitant to rely on an IS programmer or database administrator to manage the structure, because this often involves a change management process and more time. After all, the analysts maintained the web of spreadsheets that did the job long before the data mart existed, and executive management still expects the quick turnaround they were used to in the spreadsheet environment.

Compliance as a Trojan Horse: Funding Your Enterprise Data Warehouse
Stephen Brobst
Current IT spending in support of regulatory and compliance projects is significant, and it is expected to increase in coming years. Gartner figures indicate that 12 percent of current IT project budgeting is directly related to compliance, and that this figure will likely exceed 15 percent in 2007. At first glance, this may appear to be a huge burden on an organization's ability to innovate and provide competitive differentiation from within fixed (and often shrinking) IT budgets. However, innovative organizations are turning this apparent compliance liability into an asset by leveraging enterprise data warehousing infrastructure.

Retrieving Identity Values When Inserting Multiple Rows
Jeff Smith
Suppose you have the following tables:
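
The tables from the original post aren't reproduced in this digest. As a hedged illustration of one SQL Server 2005 technique for the general problem (not necessarily the one Jeff uses), the OUTPUT clause can capture the identity value of every inserted row in a multi-row insert:

    -- Hypothetical target table.
    CREATE TABLE dbo.Customers
    (
        CustomerId INT IDENTITY(1,1) PRIMARY KEY,
        Name       NVARCHAR(100) NOT NULL
    );

    DECLARE @NewIds TABLE (CustomerId INT, Name NVARCHAR(100));

    -- OUTPUT returns one row per inserted row, unlike SCOPE_IDENTITY(),
    -- which only sees the last generated value.
    INSERT INTO dbo.Customers (Name)
    OUTPUT INSERTED.CustomerId, INSERTED.Name INTO @NewIds
    SELECT Name
    FROM (SELECT 'Alice' AS Name UNION ALL SELECT 'Bob') AS src;

    SELECT CustomerId, Name FROM @NewIds;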

Rico Mariani's DLinq Performance Tips (Part 2) and Compiled Queries
Roger Jennings
Rico Mariani continues his analysis of LINQ to SQL performance issues with a new DLinq (Linq to SQL) Performance (Part 2) post dated June 25, 2007 by recommending compiled LINQ query expressions.

Rico Mariani's DLinq (LINQ to SQL) Performance Tips (Round 3)
Roger Jennings
While waiting for Rico to drop his other shoe and provide some Beta 2 LINQ to SQL performance numbers, it occurred to me that increasing the size of the object property (field) values would have a deleterious effect on LINQ to SQL performance ratios.

Synchronization Complexity in the .NET Framework, Part 2
Rico Mariani
Well, it seems like an eternity ago, but at last I'm writing the follow-up to my initial question about synchronization complexity. I'd like to start with this link to a summary of the synchronization costs of nearly all of the framework. And I say nearly all because I noticed that at least three methods had a synchronization cost that overflowed a 64-bit signed integer. Assuming (and I think this is safe to do) that they only spilled into the negative regions, those three entries are

DLinq (Linq to SQL) Performance (Part 1)
Rico Mariani
I love Linq. Really. That might scare you because it has all these wacky new constructs, and as a performance guy you'd think that I'd be all about getting rid of abstractions and just getting to the metal. But don't be scared, I haven't lost my mind. Linq is great because, even though it adds some levels of complexity, it simultaneously increases the chunkiness of the work that the framework receives in such a way that it creates fantastic opportunities to deliver performance. Just like SQL can do a great job optimizing database queries because they are chunky enough.

Caching Redux
Rico Mariani
I got some interesting questions about how to build good middle-tier caches in my inbox last week. I cleaned up the responses a little bit and I'm posting them here because they're actually pretty general. I've written about this before but some things merit repeating :)

DLinq (Linq to SQL) Performance (Part 2)
Rico Mariani
So after getting some high level times I started digging into the particulars of the costs more broadly and I ended up studying a very simple query like the below one.

DLinq (Linq to SQL) Performance (Part 3)
Rico Mariani
I'd like to start with a little housekeeping. Some readers asked me how I made the nifty table in part 2 that showed the costs broken down by major area. It was actually pretty easy to create that table using our profiler. I did 500 iterations of the test case in sampled mode and that gave me plenty of samples. I could see which callstacks ended in mscorjit.dll (even without symbols, which I never bothered using), and that gave me a good idea how much jit time there was. I could see the data-fetching functions from the underlying provider being called, the same ones that appear in the no-linq version of the code (GetInt32, GetString and so forth), so I knew what the costs of actually getting the data were. I could see the path that creates the expression tree for the query and I could see stub dispatch functions. So I added up the related ones, broke them into 5 categories and then showed one more line for the bits that didn't fit into any of those categories. Then I scaled the numbers up so that the part of the benchmark I cared about was 100% (there was other junk in my harness not relevant to the benchmark). That's it :)

Data Warehousing ROI: Justifying and Assessing a Data Warehouse
Hugh Watson, Dominic Thomas, David Preston, Daniel Chen, Dorothea L. Abraham
Drawing on the data warehousing literature, survey data, theory, and eight case studies, seven justification and assessment propositions are presented. Data warehouses require a sizeable commitment of organizational resources. As a result, there is considerable interest in how they are initially justified and later assessed. Data warehousing costs are relatively easy to estimate, but the benefits are more difficult to evaluate. Survey data shows that most companies quantify the costs of data warehousing but not the benefits. In order to better understand how companies justify and assess data warehousing investments, eight case studies were conducted. Drawing on the data warehousing literature, survey data, theory, and the case studies, seven propositions are presented.

Ten Mistakes to Avoid When Estimating ROI for Business Intelligence
Evan Levy
The most common missteps managers make when financially justifying their business intelligence (BI) and data warehouse programs. In the late 1990s, rampant overspending on IT was standard practice for companies who were flush with funding and escalating equities. Today's more sober economic climate has changed everything. Now executives are being held accountable for new capital expenditures, sometimes to the penny. They're also being asked to justify existing IT projects that a few years ago seemed bulletproof. Managers who thought their CRM, supply chain management, and data warehouse initiatives were safe have had a rude awakening. They are being asked to justify these initiatives or risk losing them to newer technology programs. This has meant going back to the business case drawing board and calculating return on investment, often for the first time.

The Business Intelligence ROI Challenge: Putting It All Together
Bill Whittemore
Getting approval for your business intelligence (BI) or data warehouse project requires being able to demonstrate business value to your decision makers. Among the most difficult tasks is providing a predictable ROI to senior management. Calculating the ROI for data warehouse and business intelligence projects is a very complicated and perplexing task. Today's decision makers can no longer accept being unable to calculate the ROI for data warehousing and business intelligence projects. This article provides a practical framework and process for calculating ROI for data warehouse and business intelligence projects.

Derby Identity Weirdness
Ben Forta
Like many DBMSs, Apache Derby (included with ColdFusion 8) supports identity fields - fields that auto-increment each time a row is added. These are commonly used for primary key values, as the DBMS itself ensures that these values are unique and never reused. One problem with identity fields is that sometimes you may need to insert a row providing an explicit value to be used, essentially overriding identity functionality. Some DBMSs (like SQL Server) allow you to turn off identity processing with an explicit directive while a row is inserted and then turn it on again, and numbering automatically continues from the new highest value. Other DBMSs (like MySQL) allow you to simply insert rows with specific values and, if present, these are used instead of auto-generated values. Derby also supports identity fields. These can be defined as ALWAYS, in which case Derby always generates the value (which can never be manually specified), or BY DEFAULT, in which case identity values are generated only if an explicit value is not provided. It's a rather nice implementation. But it does not work as you'd expect. You can indeed specify an explicit value if BY DEFAULT is used, and your value will be used. But Derby does not seem to pay attention to explicitly provided values and does not update the internal counters accordingly, so when you next insert a row without an explicit value it may generate the exact same value as the one you specified. And if that column is a primary key, well, obviously the second INSERT is going to fail. There is a workaround. When the table is created you may specify an optional START WITH value. So, if you need to load the table with 25 rows you can set START WITH to 26, and that will be the starting point for generated identity values. I guess you could also START WITH some really high number, and reserve the lower values for when you needed to explicitly provide a value. Still, this is a hack, and it makes BY DEFAULT rather useless, which is a shame.
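
A small Derby SQL sketch of the START WITH workaround described above (table and schema names are hypothetical): manually supplied keys stay below the range where generated values begin.

    -- Generated values start at 26; values 1-25 are reserved for explicit inserts.
    CREATE TABLE app.customers
    (
        id   INT NOT NULL GENERATED BY DEFAULT AS IDENTITY (START WITH 26, INCREMENT BY 1),
        name VARCHAR(100) NOT NULL,
        PRIMARY KEY (id)
    );

    -- Explicit value: allowed with BY DEFAULT, but it does not advance Derby's internal counter.
    INSERT INTO app.customers (id, name) VALUES (1, 'Preloaded row');

    -- Generated value: begins at 26, so it cannot collide with the preloaded rows.
    INSERT INTO app.customers (name) VALUES ('New row');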

Instant gratification
Leo Pasta
I expected that my first "real" post would be about one of the features that appealed to me most in SQL Server Katmai, but I will have to post about a discovery that was so useful to me, and yet so basic, that I am almost ashamed to have found it only today. So let's put aside my desire to pretend I knew it for a long time and post it; after all, I hope that there are at least a couple of DBAs who still don't know it:

End of Support for SQL Server 2000 Service Pack 3a
Matthias Berndt
This has been communicated repeatedly through Account Manager and Premier channels, just wanted to use this forum to repeat one more time through the formal statement below:

Save and Restore Files/Images to SQL Server Database
Mitch Wheat
The usual consensus is that it is a bad idea to store large files or images in a database, and it is recommended that you instead store the file system path to the image in the database and store each image as a separate file in a suitable folder structure. This helps to reduce the size of the database and keeps it performing efficiently. Despite the disadvantages, there are some advantages to storing images in the database, such as keeping all data in a single place.
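
If you do decide to keep the binaries in the database, a minimal SQL Server 2005 sketch (the table and file path are hypothetical) is a varbinary(max) column loaded with OPENROWSET ... SINGLE_BLOB:

    CREATE TABLE dbo.Documents
    (
        DocumentId INT IDENTITY(1,1) PRIMARY KEY,
        FileName   NVARCHAR(260)  NOT NULL,
        Content    VARBINARY(MAX) NOT NULL
    );

    -- The path must be visible to the SQL Server service account.
    INSERT INTO dbo.Documents (FileName, Content)
    SELECT 'invoice.pdf', BulkColumn
    FROM OPENROWSET(BULK 'C:\Uploads\invoice.pdf', SINGLE_BLOB) AS doc;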

SQL Server: Fastest Way to Create a T-SQL datetime?
Mitch Wheat
Saw this technique to create datetime values over at SQLTeam:
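
The linked technique itself isn't quoted in this digest; one commonly cited way to build a datetime from integer parts without string conversion (not necessarily the SQLTeam approach) looks like this:

    DECLARE @y INT, @m INT, @d INT;
    SELECT @y = 2007, @m = 7, @d = 15;

    -- Date 0 is 1900-01-01: add whole months for the year and month, then (day - 1) days.
    SELECT DATEADD(day, @d - 1,
           DATEADD(month, (@y - 1900) * 12 + @m - 1, 0)) AS TheDate;  -- 2007-07-15 00:00:00.000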

MICROSOFT ACQUIRES DUNDAS TECHNOLOGY FOR SQL SERVER 2008
Ed Worsfold
"Dundas is the charting technology of Reporting Services". ORLANDO, FL (June 4, 2007) - Dundas Data Visualization, a leader in the advanced data visualization market, today announced Microsoft Corp. has acquired Dundas's data visualization technology, which will be part of the final feature set for Microsoft SQL Server 2008 Reporting Services.

Microsoft Windows PowerShell and SQL Server 2005 SMO - Part 3
Muthusamy Anantha Kumar
Part I and Part II of this series discussed PowerShell installation and simple SMO, WMI cmdlets. Part 3 of this series illustrates how to script PowerShell cmdlets and execute them. Scripting is essential for automation and executing repetitive tasks.

Data Types -- The Easiest Part of Database Design
Jeff Smith
Database design can be very complicated, and it truly is an art as opposed to a science; sometimes there are multiple correct ways to model the same data with pros and cons to each. I can understand that normalization can be tricky to comprehend and to implement, and that concepts like stored procedures and foreign keys and even indexes and constraints can take time to grasp.

Surrogate key issues with Analysis Services
Marco Russo
Usually I don't post something just to raise a question, but every rule has its exceptions. As you know, using surrogate keys is a best practice for a lot of reasons. Everything works fine with SSAS when you use surrogate keys with a plain star schema. If you have an incremental process for the dimensions in the data warehouse, you are also guaranteed that surrogate keys don't change their internal meaning over time. But are you sure you want to use surrogate keys in an end-user reporting tool?

Visual Studio Team System Future Releases
Microsoft
In addition to developer productivity improvements and new supported platforms and technologies in Visual Studio, the Visual Studio release code-named "Orcas" will include the following new scenarios, features and improvements to Visual Studio Team System.

Which 3rd party databases to support?
Gert Drapers
As you might know, the team is working very hard on the next release of Visual Studio Team Edition for Database Professionals, which will be part of the upcoming Visual Studio Team System "Rosario" release. One of the things we are doing in the Rosario timeframe is converting our current implementation to a provider-based model. In Rosario we are introducing the concept of Database Schema Providers (DSPs for short), which will allow partners to plug support for 3rd-party databases into VSDBPro. The VSDBPro team will deliver providers for the Microsoft SQL Server platform (2000, 2005, 2008 and Compact Edition).

SSIS Nugget: Output the number of processed rows
Jamie Thomson
Here's a handy little tip for when you are building custom destination components or script destination components; it lets you log the number of processed rows.

Copying the Favorites from Books Online
Greg Buck Woody
I get asked a lot whether it is possible to copy the Favorites from Books Online to another Favorites folder on another computer. The answer is yes! The Help Favorites are stored by default in: C:\Documents and Settings\username\Application Data\Microsoft\Microsoft SQL Server\90\Tools\Shell\VS Help Data\Favorites.xml. Just copy this file to the equivalent location on another machine. Note: this will overwrite the Help Favorites on the second machine. You could combine Help Favorites by opening both files and copying the source from one file to the other in a logical way that follows the XML schema. Help Favorites for Business Intelligence Development Studio and Visual Studio are stored in: C:\Documents and Settings\username\Application Data\Microsoft\VisualStudio\8.0\VS Help Data\Favorites.xml. The same process and warnings apply to this file.

The PIVOT Operator
Craig Freedman
In my next few posts, I'm going to look at how SQL Server implements the PIVOT and UNPIVOT operators. Let's begin with the PIVOT operator. The PIVOT operator takes a normalized table and transforms it into a new table where the columns of the new table are derived from the values in the original table. For example, suppose we want to store annual sales data by employee. We might create a schema such as the following:
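
The schema itself is in the original post; a hypothetical version of it, together with a PIVOT query over it, could look like this (all names invented for illustration):

    CREATE TABLE dbo.AnnualSales (
        EmpId     int   NOT NULL,
        SalesYear int   NOT NULL,
        Sales     money NOT NULL
    );

    -- turn one row per (employee, year) into one row per employee
    -- with a column per year
    SELECT EmpId, [2005], [2006], [2007]
    FROM dbo.AnnualSales
    PIVOT (SUM(Sales) FOR SalesYear IN ([2005], [2006], [2007])) AS p;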

Detecting Overlapping Indexes in SQL Server 2005
mssqlisv
When SQL Server has an optimal index that satisfies the search predicates of a query, the optimizer performs an index SEEK operation, as opposed to an index (or table) scan, to retrieve the required rows; this is desirable. Based on this, one may be led to believe that having an index for every possible query predicate set would result in all queries executing optimally. While that may be true, one has to keep in mind that indexes need to be maintained whenever the underlying table data in the columns they include changes, which amounts to overhead for the database engine. So, as you may guess, there are advantages to having indexes, but having too many can result in excessive overhead. This implies that you need to carefully evaluate the pros and cons before creating indexes.
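
A hypothetical example of the kind of overlap the article is about (index and table names are invented):

    -- the first index is redundant: its key is a left-based prefix of the second,
    -- so the wider index can satisfy the same seeks, yet both indexes must be
    -- maintained on every INSERT, UPDATE and DELETE against the table
    CREATE INDEX IX_Orders_Customer      ON dbo.Orders (CustomerID);
    CREATE INDEX IX_Orders_Customer_Date ON dbo.Orders (CustomerID, OrderDate);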

How to install SQL Server 2005 Reporting Services on a Windows Vista-based computer
Teo Lachev
I spent a couple of hours today trying to get SSRS 2005 working on Vista. I was getting the infamous IIS 500 error when browsing the Report Manager (http://localhost/reports) and Report Server (http://localhost/reportserver) virtual roots. I triple-verified Brian Welcker's recommendations and the Vista considerations in the SQL Server 2005 readme file. I couldn't figure out what was going on, especially given that I had installed SSRS on my home Vista machine with no problems. The only difference this time was that I performed a files-only install of SSRS because I wanted to use an existing report catalog. In a moment of Google eureka, I came across the How to install SQL Server 2005 Reporting Services on a Windows Vista-based computer KB article. It turned out that the Reporting Services Configuration Utility had put the IIS applications in the default application pool, which in IIS 7.0 runs in Integrated managed pipeline mode. I created a new ReportServer application pool in Classic managed pipeline mode, moved the ReportServer and Reports applications into it, and the issue went away.

The SQL Server 2005 Tools
Chris Miller
I still hate them. But not as much as I did. With a massive computer upgrade they're now performing tolerably. Not great, but tolerably. They use something like 118 MB of RAM and take forever to load, but not as long a forever as it used to be. I still think there should be a quick-to-load, lightweight, easy-to-use query tool. I just want something that pulls up a query window with load/save file options and a quick menu link to Books Online.

Installation Issue With SQL Server 2005 BPA
Glenn Berry
It looks like someone messed up the build numbers of the SQL Server 2005 BPA between the CTP version and the recently released July version. The CTP version had a build number of 3038, while the July version has a build number of 1004. The setup program detects this, and refuses to install the newer version (which is the correct behavior due to the lower build number).

Microsoft SQL Server Management Pack for Microsoft Operations Manager 2005
Itay Braun
This document is based on the Microsoft SQL Server 2000 Management Pack Guide.

When are named sets in the MDX Script evaluated?
Chris Webb
Interesting discovery I made last week: I was tuning a cube and noticed that all my queries, when run on a cold cache, were much slower than I was expecting - even the most basic query seemed to take at least 40 seconds. After a lot of head-scratching I looked in Profiler and found the answer: there were two named sets in the MDX Script that used very complex expressions and together took 40 seconds to evaluate. Commenting them out reduced all my query times by 40 seconds. The problem was that I wasn't even referencing these sets in any of my queries...!

SQL Server 2005 Best Practices Analyzer - test on SSAS database
Vidas Matelis
SQL Server 2005 Best Practices Analyzer (BPA) had been available as a CTP for some time, but just a few days ago Microsoft finally released it. I tested how this tool works with Microsoft SQL Server Analysis Services 2005.

Installing SQL Server 2005 on Vista Ultimate
Andy
When installing SQL Server 2005 on Vista Ultimate, you will receive the following warning from the Program Compatibility Assistant. You will receive this warning several times. This is normal. Click the "Run program" button to proceed.

SQL Server 2008 - Using Merge From SSIS
John Welch
In my last post, I talked about using the new MERGE statement in SQL Server 2008. It allows you to merge the data in a result set with a table, based on a join condition that you specify. This is a common scenario in data warehousing, where you often need to compare a working table to a target table to determine whether a row should be inserted, updated, or deleted; this is often the case for dimension tables. Currently, there is no SSIS component in the June 2007 CTP that leverages the MERGE statement directly. This post explores one possible pattern for using MERGE from SSIS.
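
For reference, a bare-bones MERGE of the kind described above might look like this (the dimension and staging table names are made up, and real dimension handling is usually more involved):

    MERGE dbo.DimCustomer AS tgt
    USING staging.Customer AS src
        ON tgt.CustomerID = src.CustomerID
    WHEN MATCHED AND tgt.City <> src.City THEN
        UPDATE SET tgt.City = src.City
    WHEN NOT MATCHED BY TARGET THEN
        INSERT (CustomerID, City) VALUES (src.CustomerID, src.City)
    WHEN NOT MATCHED BY SOURCE THEN
        DELETE;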

Couple RAISERROR with Profiler / Event Notifications - see what's going on!
Tony Rogerson
Leo Pasta gives a good tip on using RAISERROR and NOWAIT; in addition to this, in SQL 2005 we can trap the 'User Error Message' EventClass both in SQL Profiler and using Event Notifications.
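
The tip itself is in Leo's post; the basic pattern is roughly this (a sketch, with an arbitrary loop standing in for a long-running batch):

    -- severity-0 messages raised WITH NOWAIT are flushed to the client immediately
    -- instead of being buffered, and show up in Profiler as 'User Error Message'
    DECLARE @i int;
    SET @i = 1;
    WHILE @i <= 5
    BEGIN
        RAISERROR ('finished step %d', 0, 1, @i) WITH NOWAIT;
        WAITFOR DELAY '00:00:01';   -- stand-in for real work
        SET @i = @i + 1;
    END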

Database High-Availability: Soup to Nuts
William Brewer
Unpleasant things occasionally happen to production databases. Much of this is due to chance. Occasionally, you'll be unlucky, but your good luck increases with the effort you put into ensuring the robustness of your database. The extent of your occasional misfortune will decrease in direct proportion to the increasing resilience of your database architecture and your level of preparedness.

Reverse Engineering Alias Data Types in SQL Server 2000
Yakov Shlafman
In SQL Server 2005 there is the concept of alias data types, which are similar to user-defined data types in SQL Server 2000. Yakov Shlafman brings us the first part of a series looking at these structures in SQL Server 2000.
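
Very roughly, the two generations of the feature compare like this (type and column names are illustrative; on a 2005 server you would use only the second form):

    -- SQL Server 2000: user-defined data type
    EXEC sp_addtype @typename = 'Phone', @phystype = 'varchar(20)', @nulltype = 'NULL';

    -- SQL Server 2005: the equivalent alias data type
    CREATE TYPE dbo.Phone FROM varchar(20) NULL;

    -- either one is then used like a built-in type
    CREATE TABLE dbo.Contact (
        ContactID int PRIMARY KEY,
        HomePhone dbo.Phone
    );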

XML Workshop - FOR XML PATH
Jacob Sebastian
Quickly becoming the SQL Server XML expert, Jacob Sebastian brings us a great new article that expands upon his very popular series on XML in SQL Server. This time he examines the FOR XML PATH option, which provides additional formatting capabilities.
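
As a small taste of what the article covers (the Customers table here is just a stand-in), column aliases control where each value lands in the XML:

    SELECT  CustomerID  AS '@id',            -- '@' alias becomes an attribute
            ContactName AS 'name',           -- plain alias becomes an element
            City        AS 'address/city'    -- '/' in the alias nests elements
    FROM    dbo.Customers
    FOR XML PATH('contact'), ROOT('contacts');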

The Journey to Katmai
Steve Jones
With the next version of SQL Server due for release in 2008, Steve Jones takes a look at some of the more interesting features available in the June 2007 CTP that should be ready for release next year.

Writing SQL Faster
Tom Fischer
All of us would like to code faster and with fewer mistakes. While a third-party tool usually won't help with the latter, there are a few that help you increase the speed at which you build readable code. New author Tom Fischer brings us a short tutorial on using a couple of Red Gate's tools.

Understanding DDL Triggers in SQL Server 2005
S. Srivathsani
SQL Server 2005 includes a number of new security enhancements to aid the DBA in managing their SQL Server. DDL triggers allow you to trap all kinds of DDL events that occur on your server. S. Srivathsani brings us a look at these new events.
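
A minimal example of the kind of trigger the article discusses (the trigger name and message are invented):

    -- database-scoped DDL trigger that refuses any DROP TABLE statement
    CREATE TRIGGER trg_no_drop_table
    ON DATABASE
    FOR DROP_TABLE
    AS
    BEGIN
        PRINT 'Tables cannot be dropped in this database. Contact the DBA.';
        ROLLBACK;
    END;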

Summer SQL Teaser #5 Update
Denis Gobo
Try to guess the output if you were to run this. I tested this on SQL Server 2000, 2005 and 2008 (June CTP), and I get the same output on all three servers.

SSIS Script Task In SQL Server 2008 Can Use VB Or C#
Denis Gobo
Where do I send a thank-you letter? Finally we are allowed to use C# in the SQL Server Integration Services Script Task. I always wondered why SQL Server 2005 only allows VB and not C#: you can use C# in SQLCLR but not in a Script Task. It turns out that SSIS in SQL Server 2005 uses VSA (Visual Studio for Applications), while SQL Server 2008 will use VSTA (Visual Studio Tools for Applications). Let's put these two right under each other.


SQL.RU FORUM

Most popular topics of the week

Russian SQL Server 2005 has been released!!!
Protection against duplicates
select all
Preview upcoming Releases: SQL Server 2008 June CTP now available!
Help bring a database back up
Number of Mondays in a month
What does everyone use for backups?
Does anybody understand what on earth is going on with SP2?
A question about a creative approach
Merging data
SQL 2005: database size grew five times
Please help me figure this out!!!
Why is the stored procedure working incorrectly
Sorting in a SQL 2005 view after moving from 2000
Determining the city from an address string
SQL Server runs faster after a restart
What is the most efficient way to update a lot of rows?
Database size and table size - a question
Attaching a database that is not located on the server
Automatic backups in SQL 2005


Questions left unanswered

SQLXMLBulkLoad
Custom Conflict Resolver : ConflictLogType, @log_conflict?
SSIS Foreach loop - number of processed files
Merge replication
How to see which jobs are running for a given replication
MS SQL 2005 (64 bit) vs Oracle
Subqueries in CASE
Linked Server for Indexing Service on different machines


Acknowledgements

Dear readers of the newsletter! Seven years have passed since you started receiving the weekly issues of "MS SQL Server - дело тонкое...", and the moment has now come when you are reading the final, 365th issue. That is exactly how many issues I originally decided to produce for you, and today that goal has been reached. It seems to me that the other goals and objectives set for this non-commercial project, which you can find in its description, have been achieved as well. Still, that is for you to judge rather than for me. I only wanted to do something in this direction…
I would like to thank all of our colleagues who at various times helped the newsletter by sending in their articles or translations. First of all, I would like to highlight the contribution of the newsletter's co-authors, Ирина Наумова and Виталий Степаненко. I cannot fail to mention that this contribution did not go unnoticed: both of them were successfully nominated as SQL Server MVPs. The advice and suggestions that members of the Russian SQL Server Club often shared with me also meant a great deal, for which I want to say a separate thank-you. And most importantly, I would like to list here everyone who wrote for the newsletter over the years (I hope I have not forgotten anyone): Баскакова Маргарита, Белов Владимир, Брылёв Вячеслав, Гавриленко Сергей, Главчев Дмитрий, Гречкосий Геннадий, Денисенко Александр, Денищенко Николай, Забалуев Сергей, Зубов Максим, Кривонос Наталья, Лимонов Алексей, Натальченко Андрей, Неупокоев Алексей, Приходько Дмитрий, Рахманов Мартин, Сафонов Алексей, Сердюк Владимир, Синкин Андрей, Смирнов Михаил, Снисаренко Сергей, Уфимцев Глеб and Шуленин Алексей. I would also like to thank the publishing houses "Питер" and "БХВ-Петербург" for our fruitful and mutually beneficial cooperation.
So how will you live without the newsletter now? In truth, I take all of the information from the Internet, and only from open resources. All it takes is following a handful of blogs; the list is publicly available and I update it regularly: МОИ ЛЮБИМЫЕ БЛОГИ. If something new appears, keep an eye on my blog - I will try to bring you interesting news promptly. I also intend to announce my new articles in my blog, or even publish them there, as has been the practice for the last couple of years.
I recommend paying more attention to the published Russian electronic documentation for SQL Server; it is the most authoritative source of information and is replenished on a planned basis. If I spot something new or interesting on the pages of that documentation, I will certainly share it with you in my blog, and I encourage all of you to do the same!
Microsoft is now translating all of its SQL Server web resources into Russian as a matter of course, so these are also well worth following. Everything is heading toward the point where localized documentation and interfaces will be the norm, and we will be talking about our favorite DBMS only in the great and mighty Russian language ;) You could already see this in the localization of the SQL Server technical center, which began a year ago, and of the analogous resource for developers. The work on translating TechNet Magazine articles also deserves a mention.
Another resource that promises to become very useful is the TechNet forums on SQL Server. Work is currently under way to involve our Russian-speaking colleagues from the Microsoft SQL Server Team in answering questions on those forums. I am sure you will soon be getting first-hand answers there!
Many of you have already noticed that, after the overhaul of Microsoft's feedback center, it became possible to report errors and bugs in SQL Server 2005 in Russian. For your question, complaint or suggestion to reach a Russian-speaking specialist, it is enough to start the subject with the three letters "RUS:".
You may ask: what about SQL.RU? I have saved my words about this wonderful resource, and my thanks to Александр Сибилёв, for last. For seven years Александр and I have worked closely together, and our two projects have complemented and supported each other very well. Today only one thing changes: there will be no more new issues of the newsletter, but everything else stays as it was. First of all, the publications on the site are not going anywhere; they will keep growing, and I hope not only through my own articles and translations, but also through materials submitted by the most active members of the SQL.RU community. And, of course, we will keep meeting on the SQL.RU forums, which will not be left without my attention either, and to which I am ready to devote my time first of all.

You can ask questions and share your thoughts about the newsletter in its forum.

