
Upcoming Events @MicrosoftEdu UK



Microsoft UK are dedicated to ensuring educators are up to date with the latest and greatest in Microsoft technologies and solutions for Education. Here are some of the latest news announcements and upcoming events happening this month!


The Microsoft UK EDU Roadshow returns this week! 

Last week saw the awaited return of our 2018 #MicrosoftEdu Roadshow series, which was a huge success! We are on the road again this week, spreading the word about the great things you can achieve with Microsoft in Education. If you are interested in finding out more about Microsoft in Education, there's still time to sign up via the links below:

25/04/2018 Oxfordshire: Manor School, 28 Lydalls Cl, Didcot SIGN UP HERE

27/04/2018 Weston: Weston College, Winter Gardens (Italian Gardens Entrance), Royal Parade, Weston-Super-Mare  SIGN UP HERE

*Please note: dates are regularly added so check back for updates.

Look out for more updates about the Roadshow event series on Twitter by following the hashtag #MicrosoftEdu or @microsofteduk. Also visit the Microsoft Educator Community UK Roadshow page to find out about the events near you and sign up.

Our aim is to reach every corner of the UK, so if you are able to host a Roadshow in your locality, please contact us by email at Eduroadshow@microsoft.com.


More than you imagined for less than you thought! 

How much would you imagine a school laptop should cost? Microsoft Education is working to give schools more than they imagined for less than they thought. With many schools now in the process of planning their budgets for the new school year, Microsoft are working to support them on their journey to choosing the right devices at an affordable price. Together with the Microsoft Store, the power to unlock limitless learning is now easier than ever with our affordable laptops from £249.

Find out more or purchase now - https://www.microsoft.com/education

 


What's New in EDU

Educators, edtech enthusiasts and IT professionals: We have a new episode of What’s New in EDU ready for you. This month, our round-up of the latest efforts and products from Microsoft Education catches you up on a big range of recent updates to the tools you’ve been using – and perhaps a few that you’ve yet to discover.

Our biggest updates this month include the addition of Picture Dictionary to Learning Tools, which lets students click on any word to see a corresponding image, the integration of a highly-rated Open Up curriculum into Office 365 Education, and convenient new quiz types for instant assessment in Microsoft Forms. We’ve also made it easier to collaborate in Teams and made several useful enhancements to Intune for Education.


Twitter feed Updates for Microsoft Edu UK
Check out what is happening in Microsoft in Education here in the UK by viewing the Microsoft Education Twitter updates below.


So that wraps up this week's What's New in Edu UK and Upcoming Events. Remember to follow @Microsofteduk for all our latest updates daily!


ebook deal of the week: Exam Ref 70-774 Perform Cloud Data Science with Azure Machine Learning


Save 50%! Buy here.

This offer expires on Sunday, April 29 at 7:00 AM GMT.

Prepare for Microsoft Exam 70-774—and help demonstrate your real-world mastery of performing key data science activities with Azure Machine Learning services.

Designed for experienced IT professionals ready to advance their status, Exam Ref focuses on the critical thinking and decision-making acumen needed for success at the MCSA level.

Learn more

Terms & conditions

Each week, on Sunday at 12:01 AM PST / 7:01 AM GMT, a new eBook is offered for a one-week period. Check back each week for a new deal.

eBook Deal of the Week may not be combined with any other offer and is not redeemable for cash.

Lesson Learned #42: Creating an alias for my Azure SQL Database Server


Today, I've been working on a very interesting service request.

If you need to create an alias for your Azure SQL Server, you can use the PowerShell cmdlet New-AzureRmSqlServerDnsAlias.

The syntax to use is:

# Sign in and select the subscription that contains the server
Login-AzureRmAccount
[string]$SubscriptionId = "xxxxxxxxxxxxxxxxxxxxx"
$azureCtx = Set-AzureRmContext -SubscriptionId $SubscriptionId

# Create the DNS alias for the logical server
New-AzureRmSqlServerDnsAlias -ResourceGroupName 'Resource group' -ServerName myserver1 -DnsAliasName myserver12

 

Remember that you do not need to include the .database.windows.net domain in the ServerName and DnsAliasName parameters; you only need to specify the server name.

Enjoy!

Dynamics 365 Retail Return Locations Not Populating Based on the Return Subcode Selected in POS


After installing a hotfix (for example, KB4090327) for Retail Return Locations, customers have been having issues with Return Locations not populating.

First, go to Retail > Inventory management > Return locations to create your Return location.

In my example if you select the Defect Subcode (1), the Returned item should go into the CentralHoustonDefected location (only if the item is tracking Locations). The Block inventory check box means that you are not able to sell out of the Defected Location.

The most common reason for the Return Locations not populating is:

  • The Return Location is not added to the Retail Product Hierarchy.

Go to Retail > Product and categories > Product categories. Select a node that contains your product and expand the Manage inventory category properties FastTab. Select your company and set the “Return location”.

Please note that there are two ways to get into product categories (through hierarchy setup in the Products menu and through the Retail menu). When you go in through product setup, it doesn’t show the inventory configuration section that needs to be used for this, so users must open the form from the Retail menu. I do agree that this is very confusing, and it is a big reason why this step is missed and reported as a bug (when this is currently not considered a bug).

  • The SiteWarehouseLocation is not Active
  • Verify a customization is not preventing the Return location from populating.

Is Your Development Staff Ready for Artificial Intelligence?


In this post, Principal Consultant/ADM Larry Duff discusses some ethical challenges in Artificial Intelligence.


Artificial intelligence has been a dream of computer scientists for many years. I remember, in my early days of programming, I had a Commodore PET. I was excited that I had a book of programs; I typed them in and saved them to my tape drive. One of those programs was ELIZA. I could type in questions and "she" would answer me; I had my own HAL 9000! If I had understood the code I typed in (I was only 10), I would have seen it was a rudimentary natural language processor with canned responses. It was just a cheap knock-off of a program written earlier at MIT, which for its day was advanced.

It’s not just computer scientists who dream of the computer that helps them; the average person's imagination has been piqued for years, whether they think of it in those terms or not. They have been going to movies about Artificial Intelligence for many years.

  • 2001 A Space Odyssey - Hal 9000 (1968)
  • Star Wars - C3PO (1977)
  • Blade Runner - Nexus-6 (1982)
  • Terminator - SkyNet (1984)
  • Star Trek Generations - Data (1994)
  • The Matrix (1999)
  • Resident Evil - Red Queen (2002)
  • I, Robot - VIKI (2004)

We use the words, we dream of it, but what really constitutes Artificial Intelligence? According to Stanford Professor John McCarthy who coined the term 'Artificial Intelligence' in 1955,

"It is the science and engineering of making intelligent machines, especially intelligent computer programs. It is related to the similar task of using computers to understand human intelligence, but AI does not have to confine itself to methods that are biologically observable."

At what point do we jump from an algorithm to true intelligence? Let's look at this from a different angle: what makes you or me intelligent? How about another definition:

intelligence: the ability to learn or understand or to deal with new or trying situations : reason;

For software, I'd break it down to "learning and performing actions that aren't in the original programming." Today there are no true artificial intelligence machines, but we continue to get closer with the development of neural networks. Will Quantum Computing become so powerful that algorithm-driven predictive software is indistinguishable from true Artificial Intelligence? We had a lot of examples above of Artificial Intelligence in the fantasy world; we are still a long way off from those examples.

With advances in processing power outpacing Moore’s Law over the last few years, the time is right for companies like Microsoft, Alphabet, IBM, and many others to ramp up their push into Artificial Intelligence. AI is being used in real-world applications today, some of them critical to our society. Here are some things behind the scenes affecting you that you may not know about:

  • Qualifying for a loan from a bank
  • Assessing accuracy of medical diagnosis
  • Determining who gets recommended for a job

How about things as a consumer that you are seeing every day?

  • Deep Blue - IBM's chess computer
  • Personal Assistants - Siri, Alexa, Cortana
  • Media Matching - Pandora, Netflix, Spotify
  • Smart Home - Nest, Wink, Honeywell
  • Autonomous Vehicles

Famous science fiction writer Isaac Asimov wrote the book I, Robot (which was made into the movie referenced above) and the subsequent Robot book series. Within the story, the big ethical question was answered by a set of laws for robots:

  1. A robot may not injure a human being or, through inaction, allow a human being to come to harm.
  2. A robot must obey orders given to it by human beings except where such orders would conflict with the First Law.
  3. A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.

At what point do we need a set of laws like this for Artificial Intelligence? Who will set these laws? Who will enforce these laws? Who is responsible for AI programs when they go bad? In the movie Terminator they blamed the engineer who built SkyNet and thought eliminating him would end judgement day.

We are at the beginning of a potentially very deep rabbit hole: the ethics of Artificial Intelligence. Let's say you ask your favorite personal assistant for a restaurant recommendation. It recommends Jane's Bistro, not because it's highly rated, but because Jane's Bistro paid to be at the top of the recommendations. You didn't ask for the top rated, so technically they did nothing wrong. But is what they did ethical, or is it buyer beware? There are ways this could get out of control: doctors, lawyers, contractors. What happens when one of these recommendations is horrible and causes harm? Does the maker of the personal assistant bear any responsibility? At some point soon, if not already, there is going to be liability attached to these "Artificial Intelligence" machines.

Let's go deeper: an autonomous automobile hits another car. Who is liable? The owner of the car, the manufacturer… how about the software engineer who wrote the program?

When we move these projects toward artificial intelligence, what new roles do we introduce? Will we need to introduce an ethics team? How do you teach ethics to software engineers? Do you even try to? As part of Microsoft’s AI research, they have even created a separate committee to help: Microsoft’s AI and Ethics in Engineering and Research (AETHER) Committee.

There is a twin issue to ethics that is just as hard to solve, bias. Where you live, how you grew up, your friends, co-workers all make you form opinions which lead to conscious and unconscious biases. Biases will slip into the code for AI, like it or not. It’s been theorized that biases are also introduced in AI through the data collection that is ongoing. How can you program out bias when you don’t even know what bias will be the ghost in the machine?

Did you know that Civil Engineers must typically have a Professional Engineer license in the state they live? When that new building or bridge is designed they take the responsibility for correctness. Increasingly Mechanical Engineers and even Electrical Engineers are being licensed. Should we be looking at licensing software engineers? What do you license them on around AI, bias free code… ethical code?

When you let loose that next chat bot you might need to think of a whole new set of requirements. You have to determine your next steps, and I hope I gave you food for thought about those new requirements. I can help you get technically ready. Check out Microsoft’s AI Platform. Start with a Bot, add some Machine Learning, then jump into Cognitive Services. If you’re looking for a place to start, try the Bot Builder SDK for .NET samples. All the samples are available on GitHub and really easy to get running. Looking forward to your next solution!


Premier Support for Developers provides strategic technology guidance, critical support coverage, and a range of essential services to help teams optimize development lifecycles and improve software quality.  Contact your Application Development Manager (ADM) or email us to learn more about what we can do for you.

Known Issues with SSL certificate rotation feature in LCS


On April 22, 2018, we released a feature in LCS that enables Project owners and Environment managers to rotate the SSL certificate on one-box environments deployed in their own subscription using the Rotate secrets feature. Since then we have had customers ask questions as well as report some issues.

This post will continue to be updated on the issues that the team is currently tracking along with potential workarounds.

Issue: Rotate secrets option does not show up for my environment

Answer: The Rotate secrets option is available to users that are logged in as Project owners or Environment managers for one-box Demo/DevTest environments that are in the Deployed state in customer/partner subscription (not managed by Microsoft). If your environment is in the Stopped state, the Rotate secrets option will not be available. If the above conditions are met and the Rotate secrets option is still not available, open a support incident.

Issue: Rotate secrets completes successfully and environment is usable but the Environment history page in LCS is not updated

Answer: We have found an issue in LCS  that is causing the environment history to not be updated in a timely manner. There is a significant delay between the operation completing and the environment history being updated. We are investigating this issue. However, even if the environment history is not updated, the environment is safe to use.
Resolution Status: Investigation in progress.

Issue: SSL certificate rotation completed successfully but the Environment details page still shows the warning message about cert rotation being needed

Answer: We have found a sync issue between LCS and the machine. The issue is causing the warning message to show even though the certificate rotation operation completed successfully. If the certificate rotation is complete, you can ignore the warning and use the environment.
Resolution status: Investigation in progress.

Issue: Environment is in Incomplete state after the certificate rotation has been completed

Answer: We are investigating the root cause for this issue. However, as a workaround to fix the environment state issue, first confirm if the certificate rotation completed successfully. To do that, complete the following steps. After you have confirmed that certificate rotation has completed successfully, you can start and stop the environment from LCS to fix the environment state. If there are specific services started on the machine, you must restart those services.

  1. Remote desktop into the machine and launch the IIS Manager.
  2. Go to Sites > AOSService > Site Bindings, and select https/403.
  3. Click Edit, and then click View on SSL certificate.
  4. If the Valid period is listed as 4/12/2018 to 4/12/2020, the certificate has been successfully rotated.

Keeping Data Lake Costs Under Control: Creating Alerts for AUs Usage Thresholds.


Have you ever been surprised by a larger-than-expected monthly Azure Data Lake Analytics bill? Creating alerts using Log Analytics will help you know when the bill is growing more than it should. In this post, I will show you how to create an alert that emails a message whenever the total AUs assigned to jobs exceeds a daily threshold – it’s easy to get started!

This is another post in a series on how to save money and reduce costs with Azure Data Lake Analytics.

Connect your Azure Data Lake Analytics account to Log Analytics

Follow the steps in our previous blog post on Log Analytics to connect your accounts and start collecting usage and diagnostics logs – in this specific case, make sure you select the Audit logs to create this alert:

Selecting the type of event logs to share to Log Analytics

 

Create the query: Azure Data Lake Analytics AUs assigned

A simple Azure Log Analytics query showing the recently completed jobs is:

search *
 | where Type == "AzureDiagnostics"
 | where ResourceProvider == "MICROSOFT.DATALAKEANALYTICS"
 | where OperationName == "JobEnded"

Log Analytics entry for a completed Azure Data Lake Analytics job

 

The attribute Parallelism_d in the previous query contains the total number of AUs assigned by the user to a job that has ended (regardless of status). The following query aggregates the results over a 1-day interval and sums all the values of the Parallelism_d column, returning the total AUs assigned by users to jobs that ended. Note that this is not the total number of AU-hours (we will cover that in a later blog post).

We will use this query to power our Log Analytics alert:

search *
 | where Type == "AzureDiagnostics"
 | where ResourceProvider == "MICROSOFT.DATALAKEANALYTICS"
 | where OperationName == "JobEnded"
 | summarize AggregatedValue = sum(Parallelism_d)  by bin(TimeGenerated, 1d)

Log Analytics query that sums all the AUs assigned by users for jobs in a day

 

Creating the Log Analytics alert

If you want to see the step-by-step guide to create a new Log Analytics alert, check out our recent blog post on creating Log Analytics Alerts.

For the alert signal logic, use the following values:

  • Use the query from the previous step
  • Set the sum of AUs to 50 as the threshold (you can use any number that reflects your own threshold)
  • Set the trigger to 0: whenever the threshold is breached
  • Set the period and frequency for 24 hours.

Alert signal logic and settings

 

Once the total AUs assigned to jobs within a 24-hour period exceeds the threshold set in the alert, the users/teams in the alert action group will get an email alert:

Alert notification email

 

Conclusion

In this blog post I showed you how to set up an alert whenever a specific threshold of AUs is exceeded. This usage alert can directly help you manage costs and understand Data Lake Analytics usage in your organization - create your notifications and share your experiences with us!

In future posts, we'll cover other useful alerts and notifications that can be set up for your Azure Data Lake Analytics and Data Lake Store accounts. Go ahead and set your own usage alerts -- it's easy to get started!

Doing More With Functions: Comment-Based Help


I just wanted to throw together a post highlighting how cool and easy it is to add help data to your own Functions and scripts.

The help data gets added via comments. For functions the help data can go in three places:

  1. Before the function keyword (I like it up here)
  2. Between the open curly brace and the param() statement
  3. At the bottom of the function before the closing curly brace (I hate this spot)

For scripts, we just put it at the top of the script before the param() statement, or at the bottom, but the bottom will mess with code signing.

Syntax just involves using a dot before the help keyword and then typing the help you want for it on the next line:

<#
.<keyword>
<help data for it>
#>

Here is a list of all the keywords I found and short descriptions of them:

<#
.SYNOPSIS
A brief description of the function or script. 
This keyword can be used only once in each topic.

.DESCRIPTION
A detailed description of the function or script. 
This keyword can be used only once in each topic.

.PARAMETER <Parameter-Name>
The description of a parameter. 
Add a ".PARAMETER" keyword for each parameter in the function or script syntax.

.EXAMPLE
A sample command that uses the function or script, optionally followed by sample output and a description. 
Repeat this keyword for each example.

.INPUTS
The Microsoft .NET Framework types of objects that can be piped to the function or script. 
You can also include a description of the input objects.

.OUTPUTS
The .NET Framework type of the objects that the cmdlet returns. 
You can also include a description of the returned objects.

.NOTES
Additional information about the function or script.

.LINK
The name of a related topic. 
The value appears on the line below the ".LINK" keyword and must be preceded by a comment symbol # or included in the comment block.
Repeat the ".LINK" keyword for each related topic.

The "Link" keyword content can also include a Uniform Resource Identifier (URI) to an online version of the same help topic. The online version opens when you use the Online parameter of Get-Help. The URI must begin with "http" or "https".

.COMPONENT
The technology or feature that the function or script uses, or to which it is related. 

.ROLE
The user role for the help topic. 

.FUNCTIONALITY
The intended use of the function. 

.FORWARDHELPTARGETNAME <Command-Name>
Redirects to the help topic for the specified command. 
You can redirect users to any help topic, including help topics for a function, script, cmdlet, or provider.

.FORWARDHELPCATEGORY <Category>
Specifies the help category of the item in "ForwardHelpTargetName". 
Valid values are "Alias", "Cmdlet", "HelpFile", "Function", "Provider", "General", "FAQ", "Glossary", "ScriptCommand", "ExternalScript", "Filter", or "All". 
Use this keyword to avoid conflicts when there are commands with the same name.


.REMOTEHELPRUNSPACE <PSSession-variable>
Specifies a session that contains the help topic. 
Enter a variable that contains a "PSSession".

.EXTERNALHELP
Specifies an XML-based help file for the script or function.

#>

For the most part, I recommend using a few of the most common and important ones, which I think are:

  • Description
  • Parameter for every unclear parameter
  • Examples to help your users

Let's try it

<#
.Description
This function takes in a message and writes it out to the screen in cyan.

.Parameter Msg
This is the message that will be written to the screen in cyan

.Example
HelpTest -msg "Hello world"

.Example
HelpTest "Hello World"

#>

function HelpTest
{
    Param($msg)
    Write-Host $msg -ForegroundColor Cyan
}

get-help helptest -showwindow

Notice PowerShell numbers the examples for us in case we add more or move them around.

To make tacking on the help data easier, I recommend you utilize snippets (ctrl+J OR edit->snippets) and make a custom one.

$helpText = @"
<#
.Description

.Parameter 

.Example

#>
"@

New-IseSnippet -Title "Help (Simple)" -Text $helpText -Author KoryT -Description "simple comment based help"

Well, that's all for now, hopefully this helps you polish off your tools and write help data the right way!

This was originally written as part of  my "PowerShell For Programmers" series, though I might link it from other stuff in the future as well.

Let me know in the comments if there are any particular topics you're looking for more help with!

If you find this helpful don't forget to rate, comment and share 🙂


[Guest post] Part 3. Step by step – How to train an object classifier, understanding Computer Vision techniques with Python and OpenCV


In the previous post I explained how to create your own image detector with TensorFlow. It should be noted that you must differentiate between a classifier and an image detector. So, what is the difference between object detection and object recognition? Well, recognition simply implies establishing whether an image contains a specific object or not, while detection also requires the position of the object within the image. For example, given an input image that contains cars, traffic lights, people, dogs, etc., the task is to be able to recognize which of the objects are contained in the image.

In this new post we will explain step by step how to create your own image classifier, not an image detector as we did in the previous post, and this time without using third-party technologies. If you have a computer with enough computing power and a GPU like an Nvidia GTX 650 or newer, you will not have any problems, but even if you do not have a very powerful computer you can follow this post and build a classifier without problems.

How to set up your virtual machine on Linux with Python and OpenCV

The classifier can be developed wherever we prefer; we have two options:

Data Science Virtual Machine for Linux (Ubuntu) on Azure

The first option is to use a preconfigured virtual machine in Azure. If we have deployed our machine in Azure as in the first post:

https://blogs.msdn.microsoft.com/esmsdn/2018/04/02/post-invited-analysis-and-object-detection-of-artworks-with-tensorflowgpu-on-windows-10/

This time we will do the same, but we will change the OS: instead of Windows, we will develop everything on Linux. Why? My intention is to show you that we can carry out our machine learning projects with Python on both Windows 10 and Ubuntu Linux without problems. Thanks to Anaconda we can easily create the environments that let us run the same processes on both systems. You can see more information here:

https://azuremarketplace.microsoft.com/en-us/marketplace/apps/microsoft-ads.linux-data-science-vm-ubuntu

https://docs.microsoft.com/en-us/azure/machine-learning/data-science-virtual-machine/dsvm-tools-overview

Image Data Science Virtual Machine Linux (Ubuntu) summary

We can see that this VM supports many of the current deep learning frameworks and comes with CUDA and cuDNN installed, although often without the paths configured. In the first post we explained how to configure these tool paths on our PC. For this project we only need a Python 2.7 environment, while this VM uses a 3.5 environment by default. Also, we do not necessarily need a graphics card to develop this artworks classifier. Therefore, developing this classifier on a virtual machine of this size is not the best or most cost-effective idea.

If we want to use this option, we have two ways to work on the classifier project: one is to connect directly to the virtual machine with PuTTY and use the command line to configure Anaconda and the rest of the project. The other is to connect via X2Go, which provides a friendly UI close to a desktop interface. Feel free to choose!

Option X2GO

We need to create a new session first.

Then we need to fill this form:

  1. Host: the host that Azure gives us to connect to
  2. User: the user you assigned when you created your machine on Azure
  3. SSH port: 22
  4. Session type: change KDE to XFCE

Once we have X2Go set up with our session, we can start our project!

The first thing we must do is configure our Python environment with the libraries that our project files need. To create the environment, we can launch this command:

But there we are not specifying any library or which version of Python we want; for this project we do not need the latest version but 2.7. Therefore we need to write this command:

Once we have our Anaconda environment configured with Python 2.7, we will be able to start running our .py files, which will be explained later.

Our PC or Laptop

To carry out the project from your PC, first you must download and install Anaconda:

https://www.anaconda.com/download/

It is very easy to get started with Python. Thanks to Anaconda we can set up our environments easily, in the same way we explained earlier. Here is a link to the Anaconda documentation, which explains how to configure your environments:

https://conda.io/docs/user-guide/tasks/manage-environments.html

Download this tutorial's repository from GitHub

Download the full repository located on this page: scroll to the top, click Clone or Download, and extract all the contents directly into the C: directory. This establishes a specific directory structure that will be used for the rest of the post. At this point, your project folder should look like:

[IMAGE]

Image Github repository solutions: API, App, Object Detection and Object classifier.

Solutions:

We will explain the first two solutions (API and App) in the last sections of this tutorial. The object classifier is the one we will explain in more detail.

Artworks Classifier Solution

 

Image many artworks of the classifier

This folder contains the images of 20 artworks and the Python files needed to train the artworks classifier. It also contains a Core ML model, generated with the coremltools library, in case you want to use it in an iOS project with Xcode.

If you want to practice training your own artwork classifier, you can leave all the files as they are. You can follow along with this tutorial to see how each of the files was generated, and then run the training. You will still need to generate the K-Means cluster model and SVM model as described in the next steps.

If you want to train your own object classifier, delete the following files (do not delete the folders):

  • All files in imagestrain

Now you're ready to start from scratch and train your own object classifier. This post will assume that all the files listed above are unknown and will explain how to use these files with your own training data set. This tutorial also explains important topics that are needed to complete it; knowing them will make it easier to understand the why of each process. These include:

  • Machine learning techniques
  • Algorithms
  • Image descriptors
  • Histograms and color histograms for artworks
  • K-Means Clustering
  • Bag of Visual Words

Introduction to Machine Learning Techniques (Computer Vision)

Machine learning is a field of computer science related specifically to Artificial Intelligence. Within it, the goal of machine learning is to create methods that give computers the ability to learn.

 [IMAGE]

Within machine learning we find a subarea that we will use to create this project, called Computer Vision. It is an area that includes different tasks such as acquiring, processing, analyzing and understanding real-world images in order to translate an image into numerical or symbolic information understandable to a computer. Data acquisition is achieved by means such as cameras, multidimensional data, scanners, etc.

Today this discipline is used for different purposes, such as:

  • Object detection
  • Video analysis
  • 3D Vision 

We will only focus on the first option, detection of objects in an image. Object detection is the part of computer vision that detects objects in an image based on their visual appearance. Within object detection we can distinguish two phases:

  • Extraction of image characteristics
    • Consists of obtaining mathematical models that summarize the content of the image. These characteristics are also called descriptors; we will discuss them in more detail in later sections.
  • Search for objects based on these characteristics
    • For the object search process, we will have to classify those objects. There are several machine learning algorithms that allow us to assign an identifier or label to an image. We will discuss them in later sections.

Therefore, image classification is the task of assigning a label from a predefined set of categories to an image. This means that, given an input image, our task is to analyze it and return a label that categorizes it. This label usually comes from a predefined set.

For example, if we run an inference with this image in our classifier model:

[IMAGE]

Image “Las Meninas”

It would have to return:

Label: Las Meninas

In a more formal way, given the previous input image of W x H pixels with three channels, R (Red), G (Green) and B (Blue), our objective will be to take the W x H x 3 = N pixels and find out how to accurately classify the contents of the image.
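As a quick illustration of this W x H x 3 representation, here is a minimal sketch (the file name las_meninas.jpg is only a placeholder) that loads an image with OpenCV and flattens it into the N raw pixel values described above:

import cv2

# Load the painting as an H x W x 3 BGR array (OpenCV uses BGR channel order)
image = cv2.imread("las_meninas.jpg")
h, w, channels = image.shape

# Flatten into a single vector of N = W x H x 3 raw pixel intensities
raw_pixels = image.reshape(-1)
print("Size: %dx%d, channels: %d, N = %d" % (w, h, channels, raw_pixels.shape[0]))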

In addition, we must realize that in computer vision we must give value to semantics. For a human it is trivial to know that the previous image is a work of art, specifically Las Meninas. For a computer it is not trivial, nor does it have any way to know it. To analyze an image, the computer differentiates three main properties of every image:

  • Spatial environment
  • Color
  • Texture

 

These properties are encoded in a computer thanks to what we previously named descriptors; each descriptor specializes in space, colors, textures, etc. Finally, with these characterizations of space, textures and colors, the computer can apply machine learning to learn what each type of image looks like; in our case, diverse types of artworks by different artists.

 

For this we also must understand how computers represent images in order to analyze them. For example, if we insert the picture of the Gioconda, or Mona Lisa, into our classifier, it would represent it like this:

[IMAGE]

Image the Gioconda and her feature descriptors

In addition, another aspect to investigate when building our artwork classifier is how the image or an object appears in an image; that is, the different points of view of a painting, different dimensions of the pictures, deformations of the image, lighting and occlusions.

1.4.2 Types of learning

When carrying out the research prior to starting the project, we encountered this question: what kind of learning do we want for our project? We observed that there are three types of learning:

  • Supervised
    • We have both image data (in image format or extracted feature vectors) along with the category label associated with each image so that we can teach our algorithm how each image category looks.
  • Unsupervised
    • All we have is the image data itself; we do not have labels or associated categories that we can use to teach our algorithm to make accurate predictions or classifications.
  • Semi-supervised
    • Tries to be a middle ground between the previous two. We have a small group of tagged image data and use that tagged information to generate more training data from the unlabelled data.

 

In the end we opted for supervised learning, since our system will have an image dataset of 20 paintings where the category or label is the name of the painting and each category contains 21 reproductions of that painting. Our classifier would look something like:

Label               Features vector
Las Meninas         […]
3 de Mayo           […]
Maja desnuda        […]
Noche estrellada    […]
Mona Lisa           […]
…                   […]


A feature vector looks something like this:

  • Starry Night (Vincent Van Gogh):

[IMAGE]

[IMAGE]

1.4.3 Pipeline of the image classifier

Having seen how to manipulate the images and how to approach our learning, we will have to investigate how to divide the work of building our artwork image classifier into processes. One of the most common pipelines is the following, in which we distinguish 5 phases:

[IMAGE]

Phase 1: Structure our initial dataset

It will be necessary to create our categories each with their images and specific labeling. In this project it would be like this:

Categories = {Las Meninas, Shooting 3 of May, Gioconda, Maja Desnuda, Starry Night, …}

 

Phase 2: Splitting our dataset (train and test)

Once we have our initial data set, we must divide it into two parts: a training set and an evaluation (test) set. Our classifier uses the training set to "learn" what each category looks like by making predictions on the input data and then correcting itself when the predictions are wrong. After the classifier has been trained, we can evaluate its performance on the test set.

[IMAGE]

Algorithms like the Random Forest classifier we use in this project have many configurable parameters that, if tuned well, help us obtain optimal performance. These parameters are called hyperparameters.
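As a minimal sketch of Phase 2, assuming the feature vectors and labels have already been collected into NumPy arrays (the data below is only a placeholder), scikit-learn can produce the split in one call:

import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data: 20 paintings x 21 images each, 64-dim feature vectors
features = np.random.rand(420, 64)
labels = np.repeat(np.arange(20), 21)

# Hold out 25% for evaluation; stratify so every painting (category)
# appears in both the training and the test split
train_x, test_x, train_y, test_y = train_test_split(
    features, labels, test_size=0.25, random_state=42, stratify=labels)

print(train_x.shape, test_x.shape)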

Phase 3: Extract features

Once we have our final data divisions, we will need to extract features to quantify and abstractly represent each image. The most common options according to our previous research are:

  • Color descriptors
  • Histogram of Oriented Gradients (HOG)
  • Histograms with local binary patterns (BRIEF, ORB, BRISK, FREAK)
  • Local Invariant Descriptors (SIFT, SURF, RootSIFT) 

Phase 4: Train our classification model

Given the feature vectors associated with the training data, we will be able to train our classifier. The objective here is for the classifier to learn how to recognize each of the categories in our labeled data. For this we have done an analytical study of algorithms such as Support Vector Machine and K-Nearest Neighbors.

Phase 5: Evaluate our classifier

Finally, we must evaluate our trained classifier. For each of the feature vectors in our test set, we present it to our classifier and ask it to predict the label of the input image. Then we tabulate the classifier's predictions for each point in the test set.

Finally, these predictions are compared with the ground-truth labels of our test set. The ground-truth labels represent what the category really is. From there we can count how many predictions our classifier got right and compute aggregate reports such as precision, recall and F-measure, which quantify the performance of our classifier.
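Continuing the split sketch from Phase 2 (so train_x, test_x, train_y and test_y are the placeholder arrays defined there), scikit-learn can train a model and tabulate those metrics for us; this is only a sketch, not the exact training script of the project:

from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, confusion_matrix

# Train a classifier and score it on the held-out test set
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(train_x, train_y)
predictions = model.predict(test_x)

# Precision, recall and F-measure per category, plus overall averages
print(classification_report(test_y, predictions))

# Confusion matrix: which paintings get mistaken for which
print(confusion_matrix(test_y, predictions))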

Classification Algorithms (SVM and Random Forest)

We can find a great diversity of algorithms in the machine learning world that we could use for our artwork classification. After a series of tests comparing SVM, Decision Trees, K-Nearest Neighbors and Random Forests, we chose the latter due to the results we obtained when classifying 5 pictorial styles. When our classifier classifies individual artworks, we use SVM.

Random Forest

We have investigated in some detail how this algorithm works. It was created and introduced to the scientific community by Leo Breiman in his 2001 article, Random Forests. It is one of those algorithms that many scientists still can hardly believe works: many say it is an elegant and straightforward way to perform classification; others say it is simply multiple decision trees with a touch of randomness that clearly increases their accuracy.

Given the satisfactory results and the good feedback on it, we decided to show in this section how the algorithm works. Random Forests are a type of ensemble classification method: instead of using a single classifier as we did in the tests with SVM and K-NN, they use multiple classifiers that are aggregated into a single meta-classifier. In our case we will build multiple decision trees in a forest and then use the forest to make predictions.

[IMAGE]

As you can see in the previous figure, our random forest consists of multiple grouped decision trees. Each decision tree "votes" on what it believes is the final classification. These votes are tabulated by the meta-classifier, and the category with the most votes is the final classification.

We have to mention "Jensen's Inequality" to understand a large part of how Random Forests work. Dietterich's seminal work (2000) details the theory of why ensemble methods can generally obtain greater precision than a single model alone. This work depends on Jensen's Inequality, which is known as the "diversity" or "ambiguity decomposition" in the machine learning literature.

The formal definition of Jensen's Inequality states that the convex combination (average) of the models will have an error less than or equal to the average error of the individual models. An individual model may have a smaller error than the average of all the models, but since there is no criterion we can use to "select" that model, we can be sure that the average of all the models will not be worse than selecting any individual model at random.

Another crucial factor is bootstrapping, or randomization injection. These classifiers train each individual decision tree on a bootstrap sample of the original training data. Bootstrapping is used to improve the accuracy of machine learning algorithms while reducing the risk of overfitting.

In the following figure we simulate the "votes" created in the nodes of our classifier. We pass an input feature vector through each of the decision trees, receive the class-label vote of each tree, and then count the votes to produce the final classification or prediction.

[IMAGE]

In conclusion, we have used this classifier because it is an ensemble method consisting of multiple decision trees. Ensemble methods such as Random Forests tend to obtain greater precision than other classifiers, since they average the results of each individual model. The reason this averaging works is Jensen's Inequality.

Randomness is introduced in the selection of training data and in the selection of feature columns when training each tree in the forest. As Breiman discussed in his article Random Forests, performing these two levels of random sampling helps (1) avoid overfitting and (2) generate a more precise classifier.
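To make the voting idea concrete, here is a small sketch that reuses the model and test_x placeholders from the Phase 5 sketch above; scikit-learn exposes the individual trees of the forest, so we can compare their averaged votes with the forest's final answer:

import numpy as np

# One query image's feature vector (any row of the test split will do)
sample = test_x[0].reshape(1, -1)

# Each tree casts its own probability "vote"; the forest averages them
per_tree = np.array([tree.predict_proba(sample)[0] for tree in model.estimators_])
avg_proba = per_tree.mean(axis=0)

print("Class with most votes:", model.classes_[np.argmax(avg_proba)])
print("Forest prediction:    ", model.predict(sample)[0])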

Support Vector Machine

The reason SVMs are so popular is that they have quite solid theoretical foundations. The hyperparameters still need tuning, but in general, throwing an SVM at a problem is an effective way to quickly obtain a prediction or a good result. However, parameter tuning has given me problems: if you want an optimal result you need to play with the parameters and adjust them as closely as possible to your problem. For this I found GridSearchCV in scikit-learn, which lets you declare the parameters you want to vary and automates a training run over every possible combination of those hyperparameters, saving a lot of time compared to testing different values one by one.
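A minimal sketch of that hyperparameter search, reusing the train_x and train_y placeholders from the earlier split sketch; the parameter grid below is only an example, not the exact grid used in the project:

from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Candidate hyperparameters: GridSearchCV cross-validates an SVM for
# every combination and keeps the best one
param_grid = {
    "C": [0.1, 1, 10, 100],
    "kernel": ["linear", "rbf"],
    "gamma": [0.01, 0.001, 0.0001],
}

search = GridSearchCV(SVC(), param_grid, cv=3, n_jobs=-1)
search.fit(train_x, train_y)

print("Best parameters:", search.best_params_)
print("Best cross-validation score:", search.best_score_)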

Types of SVM

We will only explain how the linear SVM type works.

Linear separability

In order to explain SVMs, we should first start with the concept of linear separability. A set of data is linearly separable if we can draw a straight line that clearly separates all data points in class #1 from all data points belonging to class #2:

[IMAGE]

Image. Given our decision boundary, I am more confident that the highlighted square is indeed a square, because it is farther away from the decision boundary than the circle is.

Take a few seconds to examine this plot and convince yourself that there is no way to draw a single straight line that cleanly divides the data points such that all blue squares are on one side of the line and all red circles on the other. Since we cannot do that, this is an example of data points that are not linearly separable.

*Note: As we’ll see later in this lesson, we’ll be able to solve this problem using the kernel trick.

In the case of Plots A and B, the line used to separate the data is called the separating hyperplane. In 2D space, this is just a simple line. In 3D space, we end up with a plane. And in spaces with more than 3 dimensions, we have a hyperplane.

Regardless of whether we have a line, plane, or a hyperplane, this separation is our decision boundary or the boundary we use to decide if a data point is a blue rectangle or a red circle. All data points for a given class will lay on one side of the decision boundary, and all data points for the second class on the other.

Keeping this in mind, wouldn’t it be nice if we could construct a classifier where the farther a point is from the decision boundary, the more confident we are about its prediction?

Image Descriptors

A very important part of our project, if not the main one, is knowing how to extract the descriptors of the images in the best and most efficient way possible. Before diving into the world of descriptors, we will briefly explain some basic computer vision concepts that many professionals overlook: image descriptors, feature descriptors and feature vectors. All these terms are very similar, but it is very important to understand the difference between them.

To begin with, a feature vector is simply a list of numbers used to abstractly quantify the content of an image. Feature vectors are passed to other computer vision programs, for example to build a machine learning classifier that recognizes the contents of an image, or to compare feature vectors for similarity when constructing an image search engine. To extract feature vectors from an image, we can use image descriptors or feature descriptors.

An image descriptor quantifies the complete image and returns one feature vector per image. Image descriptors tend to be simple and intuitive to understand but may lack the ability to distinguish between different objects in the images.

On the contrary, a feature descriptor quantifies many regions of an image, returning multiple feature vectors per image. Feature descriptors tend to be much more powerful than simple image descriptors and more robust to changes in rotation, translation and point of view of the input image.

An impediment that arises when extracting feature descriptors is that not only do we have to store several feature vectors per image, which increases our storage overhead, but we also need to apply methods such as Bag of Visual Words to take the multiple feature vectors extracted from an image and condense them into a single feature vector. The Bag of Visual Words technique will be explained in another section.

Within the descriptors there are several types, but for our case we will use the local invariant descriptors, consisting mainly of:

  • SIFT (Scale-Invariant Feature Transform), with the DoG keypoint detector
  • SURF (Speeded Up Robust Features), with the Fast-Hessian keypoint detector

We will explain how each of them works with an example run on the artworks, and we will explain the reason for our choice of SIFT through results represented in graphs.

Understanding Local features

The most usual approach when analyzing an image is to apply image descriptors to the complete image, which leaves us with a global quantification of the image. However, a global quantification means that every pixel of the image is included in the calculation of the feature vector, and that may not always be the most appropriate.

Suppose for a second that we were tasked with building a computer vision system to automatically identify artworks. Our system would take a photo of an artwork captured from a mobile device such as an iPhone or Android, extract features from the image, and then compare the features to a set of artworks in a database.

Can we use HOG to solve this problem? Or LBPs?

The problem with these descriptors is that they are all global image descriptors, and if we use them we will end up quantifying image regions that do not interest us, such as people standing in front of the paintings. Including these regions in our vector calculation can dramatically skew the output feature vector, and we run the risk of not being able to correctly identify the artwork.

The solution to this is to use local features, where we only describe small local areas of the image that are considered interesting instead of the whole image. These regions must be unique, easily compared and carry some kind of semantic meaning in relation to the contents of the image.

*Note: At the highest level, a “feature” is a region of an image that is both unique and easily recognizable.

Keypoint detection and feature extraction

The process of finding and describing interesting regions of an image is broken down into two phases: keypoint detection and feature extraction.

The first phase is to find the “interesting” regions of an image. These regions could be edges, corners, “blobs”, or regions of an image where the pixel intensities are approximately uniform. There are many different algorithms that we’ll study that can find and detect these “interesting” regions — but in all cases, we call these regions keypoints. At the very core, keypoints are simply the (x, y)-coordinates of the interesting, salient regions of an image.

Then for each of our keypoints, we must describe and quantify the region of the image surrounding the keypoint by extracting a feature vector. This process of extracting multiple feature vectors, one for each keypoint, is called feature extraction. Again, there are many different algorithms we can use for feature extraction, and we’ll be studying many of them in this module.

However, up until this point, we have had a one-to-one correspondence between images and feature vectors. For each input image, we would receive one feature vector out. However, now we are inputting an image and receiving multiple feature vectors out. If we have multiple feature vectors for an image, how do we compare them? And how do we know which ones to compare?

 

As we’ll find out later in this tutorial, the answer is to use either keypoint matching or the bag-of-visual-words model.
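Here is a minimal sketch of both phases with OpenCV, assuming an opencv-contrib build that still exposes SIFT under cv2.xfeatures2d (the image path is a placeholder):

import cv2

# Load the painting and work in grayscale for keypoint detection
image = cv2.imread("las_meninas.jpg")
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Phase 1: detect the "interesting" regions (keypoints)
# Phase 2: describe each keypoint region with a feature vector
sift = cv2.xfeatures2d.SIFT_create()
keypoints, descriptors = sift.detectAndCompute(gray, None)

# One 128-dim feature vector per keypoint
print("Keypoints: %d, descriptors: %s" % (len(keypoints), descriptors.shape))

# Draw the keypoints over the painting, similar to the figures below
output = cv2.drawKeypoints(image, keypoints, None,
                           flags=cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)
cv2.imwrite("keypoints.jpg", output)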

Research on SIFT and SURF

First, we explain what SIFT is. The SIFT descriptor is a lot easier to understand than the Difference of Gaussians (DoG) keypoint detector, also proposed by David Lowe in his 1999 ICCV paper, Object recognition from local scale-invariant features. The SIFT feature description algorithm requires a set of input keypoints. Then, for each of the input keypoints, SIFT takes the 16 x 16-pixel region surrounding the center pixel of the keypoint region.

The SURF descriptor is a feature vector extraction technique developed by Bay in his 2006 ECCV paper. It is very similar to SIFT, but it has two main advantages over SIFT.

  1. The first advantage is that SURF is faster to calculate than SIFT, so it is more suitable for real-time applications.
  2. The second advantage is that SURF is only half the size of the SIFT descriptor. SIFT returns a feature vector of 128-dim and SURF returns a vector of 64-dim.

Having understood its advantages, we set ourselves the objective of understanding why SURF works and why it obtains such satisfactory results; for this we had to understand a keypoint detector called Fast Hessian. Today, both the SURF image description algorithm and the Fast Hessian keypoint detector are simply called "SURF" by the scientific community, although this can be confusing.

Initially, before SIFT and SURF were introduced, there was a trend in which proposed algorithms included both a keypoint detector and an image descriptor, so sometimes we see keypoint detectors and image descriptors share the same name.

But how does the Fast Hessian keypoint detector work?

The motivation of Fast Hessian and SURF came from the slowness of DoG and SIFT. The computer vision researchers wanted a faster keypoint detector and image descriptor.

The Fast Hessian is based on the same principles as DoG, in that keypoints must be repeatable and recognizable at different scales of an image. However, instead of calculating the Difference of Gaussians explicitly as done in DoG, Bay proposed approximating the Difference of Gaussians step using what are called Haar wavelets and integral images. We will not go into detail about this technique, but in the following image you can see how it works in theory:

[IMAGE]

And from the previous result we build a matrix:

[IMAGE]

A region will be marked as a keypoint if the candidate pixel's score is greater than that of all its neighbors in a 3 x 3 x 3 neighborhood. Unlike SIFT, this time we are only interested in maxima, not both maxima and minima. Using Fast Hessian for our application is the best choice, since it is very appropriate for a real-time approach. In the following image we have tested an image of "The Persistence of Memory" by Dalí, running Fast Hessian on it and extracting its keypoints:

[IMAGE]

Image Fast Hessian Keypoints of “Persistence of memory”

Once these keypoints are extracted, our image description algorithm, SURF, goes through each of the keypoints obtained and divides the keypoint region into 4 x 4 sub-areas, just like SIFT.

[IMAGE]

It is from this step that SIFT and SURF begin to differ. For each of these 4 x 4 sub-areas, SURF extracts 5 x 5 sample points by means of Haar wavelets.

[IMAGE]

And for each extracted cell, both the X and Y directions are processed with Haar wavelets. These results are known as d_{x} and d_{y}.

[IMAGE]

Now that we have d_{x} and d_{y}, we weight them by means of what is called a Gaussian kernel, as in SIFT. Results furthest from the center of the keypoint contribute less to the final feature vector, while results closer to the center contribute more. Finally, to finish the process and compute its feature vector, SURF evaluates this formula for each 4 x 4 sub-area:

[IMAGE]

Therefore, as we said at the beginning, we have 4 x 4 = 16 sub-areas, each returning a 4-dim vector. These 4-dim feature vectors are concatenated with each other, giving rise to the 16 x 4 = 64-dim SURF feature vector. Below are results with paintings by Velázquez (Las Meninas) and Da Vinci (The Last Supper).

[IMAGE]

Image SURF keypoints of “Las Meninas”

[IMAGE]

Image SURF Keypoints of “The Last Supper"

Histograms and color histograms for artworks

Within the fundamental concepts in computer vision we must give weight to histograms. So, what exactly is a histogram? A histogram represents the distribution of pixel intensities (whether color or grayscale) in an image. It can be visualized as a graph (or plot) that gives a high-level intuition of the intensity (pixel value) distribution. We are going to assume an RGB color space in this example, so these pixel values will be in the range of 0 to 255. And now, what is a color histogram? Simply put, a color histogram counts the number of times a given pixel intensity (or range of pixel intensities) occurs in an image. Using a color histogram, we can express the actual distribution or “amount” of each color in an image. The counts for each color/color range are then used as our feature vector.

[IMAGE]

Image Athens School artwork

[IMAGE]

Image Histogram of Athens School artwork

We see there is a sharp peak in the green and red histograms around bin 200; the end of the histogram reflects that the men on the right, with togas of "green" and "red" tonality, contain many pixels in that color range. We see that red, green and blue pixels are present across most of the range, except at the end, where blue decreases.

*Note: I’m intentionally revealing which image I used to generate this histogram. I’m simply demonstrating my thought process as I look at a histogram. Being able to interpret and understand the data you are looking at, without necessarily knowing its source, is a good skill to have in computer vision.

Now in the image below we can see a graphic of computing 2D color histograms for each combination of the red, green, and blue channels. The first is a 2D color histogram for the Green and Blue channels, the second for Green and Red, and the third for Blue and Red.

[IMAGE]

Image computing 2d color histograms for each combination of the red, green, and blue channels.
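As a minimal sketch, a flattened color histogram like the ones above can be computed with OpenCV in a few lines (the image path is a placeholder); the resulting counts are exactly the kind of feature vector described earlier:

import cv2

image = cv2.imread("athens_school.jpg")

# 3D color histogram with 8 bins per channel: 8 x 8 x 8 = 512 counts
# (OpenCV loads images in BGR order, so the channels are [B, G, R])
hist = cv2.calcHist([image], [0, 1, 2], None, [8, 8, 8],
                    [0, 256, 0, 256, 0, 256])

# Normalize and flatten the counts into a 512-dim feature vector
hist = cv2.normalize(hist, hist).flatten()
print("Feature vector length:", hist.shape[0])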

K-Means Clustering

Once we understand how to extract color histograms, thanks to K-means we can group colors and know which ones predominate in each painting, artist or pictorial style, and even relate the colors we extract from a painting to a feeling, such as happy or sad.

There are many clustering algorithms in machine learning, but after some investigation I found that k-means is the most popular, most used and easiest to understand. K-Means is an unsupervised learning algorithm (there is no label/category information associated with the images/feature vectors); I explained this concept earlier in the Machine Learning techniques chapter.

Clustering algorithms seek to learn, from the properties of the data, an optimal division or discrete labeling of groups of points. It is called K-means because it finds "K" unique clusters, where the center of each cluster (the centroid) is the mean of all values in the cluster. The overall goal of k-means is to put similar points in the same cluster and dissimilar points in different clusters. For more information about clustering techniques, see this link.

Many clustering algorithms are available in Scikit-Learn and elsewhere, but perhaps the simplest to understand is an algorithm known as k-means clustering, which is implemented in sklearn.cluster.KMeans.

Applications of k-means in computer vision

In my project the k-means algorithm can also be used to extract the dominant colors of artworks, for example the top 5 colors used by Félix Ziem in his 20 most relevant artworks. We can also extract features for a single artwork or an entire style of art simply by changing the dataset.

[IMAGE]

Image example of artwork of Ziem Felix (Romanticism)

Here I show you just one picture by Félix Ziem, but in almost all of them we see the same colors. Intuitively we can distinguish the colors that predominate in this author, and thanks to K-means we can extract the exact colors, like the following:

[IMAGE]

Image Top 4 Colors predominant in Felix Ziem artworks

You might think this is complicated, but knowing what we are doing with Python we get it done quickly in a few lines of code. We only need to take the raw pixel intensities of the image dataset as our data points, pass them to k-means, and let the clustering algorithm determine the dominant colors. That's all!
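As a hedged illustration of those "few lines of code" (a standalone sketch rather than the exact project script; the file name and the choice of 5 clusters are placeholders):

import cv2
import numpy as np
from sklearn.cluster import KMeans

# Load the artwork and convert BGR -> RGB so the printed colors read naturally
image = cv2.cvtColor(cv2.imread("ziem_felix.jpg"), cv2.COLOR_BGR2RGB)

# Treat every pixel as one 3-dimensional data point (R, G, B)
pixels = image.reshape(-1, 3).astype(np.float64)

# Cluster the pixels; each cluster center is one dominant color
kmeans = KMeans(n_clusters=5, n_init=10, random_state=42).fit(pixels)

# Rank the centers by how many pixels fall into each cluster
counts = np.bincount(kmeans.labels_)
for rank, idx in enumerate(np.argsort(counts)[::-1], start=1):
    r, g, b = kmeans.cluster_centers_[idx].astype(int)
    print(f"#{rank}: RGB=({r}, {g}, {b})  {counts[idx] / counts.sum():.1%} of pixels")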

Example of extract Top 3 colors in only one artwork:

[IMAGE]

Image Vincent Van Gogh artwork Top 3 [n_clusters] colors (Cafe Terrace)

[IMAGE]

Image Salvador Dali artwork Top 5 [n_clusters] colors (Persistence of memory)

In addition, reading a little about color analysis, several modern artists and psychologists have in many cases managed to understand which feelings and emotions artists or pictorial styles convey. Charts like the one below map colors to feelings, so with the colors we extract thanks to K-Means we can relate each painting to this table.

[IMAGE]

Image example plot of color emotions

However, the most popular usage of k-means in computer vision/machine learning is the bag-of-visual words (BOVW) model where we:

  • Extract SIFT/SURF (or other local invariant feature vectors) from a dataset of images.
  • Cluster the SIFT/SURF features to form a “codebook”.
  • Quantize the SIFT/SURF features from each image into a histogram that counts the number of times each “visual word” appears.

Classification with Bag of Visual Words

This section is one of the most important! The previous sections explained several key computer vision concepts so that we can now understand the steps needed to build our image classifier, specifically for artworks. But before I start to explain the classifier code, I will briefly explain what Bag of Visual Words means.

What is BOVW?

To explain what this technique consists of, I would like to start with natural language processing (NLP), where the intention is to compare and analyze multiple documents. Each document has many different words in a certain order. With this technique we ignore the order and simply throw the words into a "bag". Once all the words are inside the bag we can analyze how often each word occurs. In NLP, each document thus becomes a histogram of word counts that can be used as features for a machine learning process.

Visual Words

In the section on image descriptors we explained what they are and how they are generated (SIFT, SURF). If we think of an image as a document of "words" generated by SIFT, we can extend the Bag of Visual Words model to classify images instead of text documents. We can imagine that a SIFT descriptor, or "visual word", represents an object in the picture; for example, one descriptor might be the eye of the Mona Lisa.

These SIFT descriptors have variations, so we need some grouping method to put together the words that represent the same thing. For example, all the features describing the eye of the Gioconda should end up in the same cluster or bin.

However, SIFT features are not as literal as "eye of the Gioconda". We cannot group them according to a human definition, only mathematically. Since SIFT descriptors are 128-dimensional vectors, we can simply build a matrix with each SIFT descriptor in our training set as its own row and 128 columns, one for each dimension of the SIFT feature. Once this is done, we feed that matrix to a clustering algorithm, such as the K-Means explained above.
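A minimal sketch of that step, using synthetic arrays as stand-ins for real SIFT descriptors so that it runs on its own:

import numpy as np
from sklearn.cluster import KMeans

# Stand-in for real SIFT output: one (N_i x 128) descriptor array per training image
rng = np.random.default_rng(0)
desc_list = [rng.random((int(rng.integers(50, 200)), 128)) for _ in range(10)]

# Stack every descriptor from every image into one big matrix (one descriptor per row)
all_descriptors = np.vstack(desc_list)

# K-Means over that matrix: each of the K cluster centers is one "visual word"
K = 50
codebook = KMeans(n_clusters=K, n_init=10, random_state=42).fit(all_descriptors)
print("Codebook shape:", codebook.cluster_centers_.shape)  # (K, 128)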

[IMAGE]

Image Example Bag Visual Words

Next we go through each individual image, and assign all of its SIFT descriptors to the bin they belong in. All the “eye” SIFT descriptors will be converted from a 128-dimensional SIFT vector to a bin label like “eye” or “Bin number 4”. Finally we make a histogram for each image by summing the number of features for each codeword. For example, with K=3, we might get a total of 1 eye feature, 3 mouth features, and 5 bridge features for image number 1, a different distribution for image number 2, and so on.

*Note: Remember, this is just a metaphor: real SIFT feature clusters won’t have such a human-meaningful definition.
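In code, the quantization step described above looks roughly like the following self-contained sketch (synthetic stand-ins again; the project's API code later does the same thing with cluster_model.predict and np.bincount):

import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
K = 50

# Stand-ins: an already-fitted codebook and the SIFT descriptors of one new image
codebook = KMeans(n_clusters=K, n_init=10, random_state=42).fit(rng.random((5000, 128)))
image_descriptors = rng.random((300, 128))

# Assign every descriptor of the image to its nearest visual word ("bin")
words = codebook.predict(image_descriptors)

# Count how often each visual word occurs -> the image's K-dimensional BOVW histogram
bovw_histogram = np.bincount(words, minlength=K)
print(bovw_histogram.shape)  # (K,)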

At this point we have converted images with varying numbers of SIFT features into K features. We can feed the matrix of M observations and K features into a classifier like Random Forest, AdaBoost or SVC as our X, image labels as our y, and it ought to be able to predict image labels from images with some degree of accuracy.
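Putting it together, here is a minimal sketch (with synthetic stand-in data and an arbitrary choice of SVC) of feeding the M x K matrix into a classifier:

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(2)

# Stand-ins: M images already converted to K-dimensional BOVW histograms, plus labels
M, K = 200, 50
X = rng.integers(0, 30, size=(M, K))   # BOVW histograms (visual-word counts)
y = rng.integers(0, 4, size=M)         # artwork labels (4 classes in this toy example)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

clf = SVC(kernel="linear")             # Random Forest or AdaBoost would work here too
clf.fit(X_train, y_train)
print("Accuracy:", clf.score(X_test, y_test))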

But how can I know how many bins I need? And what is K?

For this artworks classifier, I performed a grid search across a range of K values and compared the scores of classifiers for each K. The code is on github and references other files in the repo.

*Note: Determining which K to use for K-Means depends on your project and on how long a grid search takes; you might want to try different methods.
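For illustration only, the comparison across K values can be sketched as a simple loop like the one below; the real repository wires this into GridSearchCV and its own feature-extraction code, so treat the names and values here as placeholders:

import numpy as np
from sklearn.cluster import KMeans
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(3)

# Stand-ins: per-image SIFT descriptors and one label per image
desc_list = [rng.random((150, 128)) for _ in range(120)]
labels = rng.integers(0, 4, size=len(desc_list))
all_desc = np.vstack(desc_list)

for K in (50, 150, 300):
    # Build the codebook for this K, then one K-dimensional histogram per image
    codebook = KMeans(n_clusters=K, n_init=10, random_state=42).fit(all_desc)
    X = np.array([np.bincount(codebook.predict(d), minlength=K) for d in desc_list])
    score = cross_val_score(SVC(kernel="linear"), X, labels, cv=3).mean()
    print(f"K={K}: mean CV accuracy = {score:.3f}")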

Prepare our classifier

Once we have understood the theory, we move on to the practical part: training our classifier and obtaining the model we will use in our mobile application developed in Xamarin. The first thing we must do now is understand each of the files that make up our object classifier solution.

[IMAGE]

Image repository files

We can see that it consists of an images folder and three Python files:

  • py
  • py
  • py

We will not have to modify any of the code; the only thing each person needs to change is the dataset, to switch the type of images to classify.

[IMAGE]

Image Dataset folder

Training

From the project directory, issue the following command to begin training:

python detector_K_gridsearch.py --train_path images/train

If everything has been set up correctly, python will initialize the training. When training begins, it will look like this:

[IMAGE]

Image Reading all files from the different artwork folders

Then, after reading all the files, the program starts to extract feature descriptors for each image. In the image below I show the example of the Moulin de la Galette artwork: for the 21 images we have of this painting, we have now extracted their feature descriptors:

[IMAGE]

Image Array of descriptors of the Moulin de la Galette artwork

[IMAGE]

When we have finished generating the descriptors for each image, we group the SIFT descriptors of each group of images, perform the clustering, and start creating our codebook of visual words. Our codebook grows with K; thanks to GridSearchCV we can run several tests with different values of K to see which value is optimal for our bag of visual words. We will try K = 50, K = 150, K = 300 and K = 500 words.

[IMAGE]

[IMAGE]

Image Codebook K=50

[IMAGE]

Image Codebook K=150

[IMAGE]

Image Codebook K=300

[IMAGE]

Image Codebook K=500

[IMAGE]

Image Comparison result of GridSearchCV for value “K” between SVM and AdaBoost

At the end of the process, thanks to GridSearchCV, we automatically save the best model, in this case an SVM with an accuracy of 0.91, and we also save the model of our cluster. These are two files our API will need.
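Persisting those two artifacts is just a couple of pickle calls; here is a hedged helper sketch (the function name and folder layout are mine, chosen to match the pickles folder the API loads from later):

import os
import pickle

def save_models(best_clf, codebook, out_dir="pickles"):
    """Persist the winning SVM and the K-Means codebook where the Flask API loads them."""
    os.makedirs(os.path.join(out_dir, "svc"), exist_ok=True)
    os.makedirs(os.path.join(out_dir, "cluster_model"), exist_ok=True)
    with open(os.path.join(out_dir, "svc", "svc.pickle"), "wb") as f:
        pickle.dump(best_clf, f)
    with open(os.path.join(out_dir, "cluster_model", "cluster_model.pickle"), "wb") as f:
        pickle.dump(codebook, f)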

Python App on Azure Web App with Flask

Here I explain the steps you must follow to successfully build your Python web app with Flask on Azure:

  1. Look this doc: https://docs.microsoft.com/es-es/visualstudio/python/publishing-python-web-applications-to-azure-from-visual-studio
  2. Look this doc: https://docs.microsoft.com/es-es/visualstudio/python/managing-python-on-azure-app-service

These docs will help you get your build working on Azure. I attach an image of my Flask API solution structure below. But first, we need to create a Flask project in Visual Studio.

[IMAGE]

Image Flask Web Project template

[IMAGE]

Image API Solution files

As you can see, we have several files, but many of them are only needed to start up our solution in our Azure web app. If you look, we have a folder called "pickles" where we keep the model of our classifier produced by the SVM algorithm, as well as the cluster model created with the Bag of Visual Words technique and K-Means clustering.

[IMAGE]

Image Python environment in VS17

In addition, we can configure our solution in the "Python Environments" option, add the version of Python that we want from Visual Studio itself, and add the libraries that are important for our project, such as:

[IMAGE]

Image requirements.txt file

Do not worry about creating the requirements.txt file, because everything is in the GitHub repository. With that explained, here is a series of images showing the web app in Azure and the options we must enable before uploading our solution to Azure.

[IMAGE]

Image Azure Python Web App with Flask

[IMAGE]

Image set up python environment version

[IMAGE]

Image Add the Python extension to our web app

[IMAGE]

Image Location of python extension with Kudu

 

As I mentioned before, to upload our solution to Azure correctly we must create a series of files, such as runtime.txt. In that file you only need to write the Python version you want to use, for example:

python-3.4

Another file is runserver.py, which acts as our index page; you only need to change this line in your web.config:

<add key="WSGI_HANDLER" value="runserver.app"/>

*Note: Remember that the value must end with .app; it refers to the Flask app object inside runserver.py, not a file extension.

<?xml version="1.0" encoding="utf-8"?>
<!--This template is configured to use Python 3.5 on Azure App Service. To use a different version of Python, or to use a hosting service other than Azure, replace the scriptProcessor path below with the path given to you by wfastcgi-enable or your provider.
For Python 2.7 on Azure App Service, the path is "D:\home\Python27\python.exe|D:\home\Python27\wfastcgi.py"
The WSGI_HANDLER variable should be an importable variable or function (if followed by '()') that returns your WSGI object.
See https://aka.ms/PythonOnAppService for more information.
-->
<configuration>
  <appSettings>
    <add key="PYTHONPATH" value="D:\home\site\wwwroot"/>
    <!-- The handler here is specific to Bottle; other frameworks vary. -->
    <add key="WSGI_HANDLER" value="runserver.app"/>
    <add key="WSGI_LOG" value="D:\home\LogFiles\wfastcgi.log"/>
  </appSettings>
  <system.webServer>
    <httpErrors errorMode="Detailed">
    </httpErrors>
    <handlers>
      <add name="PythonHandler" path="*" verb="*" modules="FastCgiModule" scriptProcessor="D:\home\Python362x64\python.exe|D:\home\Python362x64\wfastcgi.py" resourceType="Unspecified" requireAccess="Script"/>
    </handlers>
  </system.webServer>
</configuration>

 

Finally, you only need to add another web.config, like the image below, at ProjectNameFolder/static/web.config, with the following content:

<?xml version="1.0" encoding="utf-8"?>
<!--This template removes any existing handler so that the default handlers will be used for this directory. Only handlers added by the other web.config templates, or with the name set to PythonHandler, are removed.
See https://aka.ms/PythonOnAppService for more information.
-->
<configuration>
  <system.webServer>
    <handlers>
      <remove name="PythonHandler"/>
    </handlers>
  </system.webServer>
</configuration>

When you have all these files correct in Visual Studio, publish the project, then go to your Azure web app, open Development Tools > Extensions, and install the Python environment that you want to use.

Then go to Kudu. The second link I attached explains very well how to finish the deployment: you need to install, via Kudu, all the requirements in your requirements.txt into your web app. You only need to locate the Python environment you installed previously in your web app; in my case, in Kudu the location is:

D:\home\Python362x64\python.exe

In these lines of code, you can see that I start by importing the libraries my solution needs:

from flask import Flask, request, render_template, jsonify
from werkzeug import secure_filename
import logging
import sys
import cv2
import numpy as np
from sklearn.externals import joblib
import os
import pickle
import json

Here I declare variables, the paths to my models (SVM and cluster model), and logging so I can see messages during execution:

# Create the app before using its logger
app = Flask(__name__)
app.logger.info('\n\n* * *\n\nOpenCV version is %s. Should be at least 3.1.0, with nonfree installed.' % cv2.__version__)
ALLOWED_EXTENSIONS = ['png', 'jpg', 'jpeg']
APP_DIR = os.path.dirname(os.path.realpath(__file__))
MAX_PIXEL_DIM = 2000
PICKLE_DIR = os.path.abspath(os.path.join(APP_DIR, './pickles/'))
LOG_PATH = os.path.abspath(os.path.join(APP_DIR, '../art_app.log'))
logging.basicConfig(filename=LOG_PATH, level=logging.DEBUG,
                    format='%(asctime)s %(levelname)s: %(message)s [in %(pathname)s:%(lineno)d]')
app.logger.setLevel(logging.DEBUG)

In these lines of code, I load my models from their respective paths in my solution:

# TODO: make pickles for kmeans and single best classifier.
filename = os.path.join(PICKLE_DIR, 'svc\svc.pickle')
model_svc = pickle.load(open(filename, 'rb'), encoding='latin1') 
filename = os.path.join(PICKLE_DIR, 'cluster_model\cluster_model.pickle')
clust_model = pickle.load(open(filename, 'rb'), encoding='latin1') 
cluster_model = clust_model
clf = model_svc

A function to validate the extension of the files sent to my API:

def allowed_file(filename):
    return '.' in filename and \
           filename.rsplit('.', 1)[1].lower() in ALLOWED_EXTENSIONS

# upload an image with curl using:
# curl -F 'file=@/home/' 'http://127.0.0.1:5000/'

Function to resize the input files:

def img_resize(img):
    height, width, _ = img.shape
    if height > width:
        # too tall
        resize_ratio = float(MAX_PIXEL_DIM)/height
    else:
        # too wide, or a square which is too big
        resize_ratio = float(MAX_PIXEL_DIM)/width
    dim = (int(resize_ratio*width), int(resize_ratio*height))
    resized = cv2.resize(img, dim, interpolation=cv2.INTER_AREA)
    app.logger.debug('resized to %s' % str(resized.shape))
    return resized

This is the most complex function: when I get the file, I resize it, extract the features of the picture, run them through our cluster model, and finally send the result to the prediction function, which is in charge of running the inference against our SVM model:

def img_to_vect(img_np):
    """
    Given an image and a trained clustering model (e.g. KMeans),
    generates a feature vector representing that image.
    Useful for processing new images for a classifier prediction.
    """
    height, width, _ = img_np.shape
    app.logger.debug('Color image size - H:%i, W:%i' % (height, width))
    if height > MAX_PIXEL_DIM or width > MAX_PIXEL_DIM:
        img_np = img_resize(img_np)
    gray = cv2.cvtColor(img_np, cv2.COLOR_BGR2GRAY)
    sift = cv2.xfeatures2d.SIFT_create()
    kp, desc = sift.detectAndCompute(gray, None)
    # assign each descriptor to its nearest visual word, then count occurrences
    clustered_desc = cluster_model.predict(desc)
    img_bow_hist = np.bincount(clustered_desc, minlength=cluster_model.n_clusters)
    # reshape to an array containing 1 array: array[[1,2,3]]
    # to make sklearn happy (it doesn't like 1d arrays as data!)
    return img_bow_hist.reshape(1, -1)

Finally, the prediction function takes the result of img_to_vect, runs the inference against our SVM model, and returns the result to the client:

def prediction(img_str):
    nparr = np.fromstring(img_str, np.uint8)
    img_np = cv2.imdecode(nparr, cv2.IMREAD_COLOR)
    # convert to K-vector of codeword frequencies
    img_vect = img_to_vect(img_np)
    prediction = clf.predict(img_vect)
    return prediction[0]

And here is our single API endpoint:

@app.route('/predict', methods=['POST'])
def home():
    if request.method == 'POST':
        f = request.files['file']
        if f and allowed_file(f.filename):
            filename = secure_filename(f.filename)
            app.logger.debug('got file called %s' % filename)
            lb = prediction(f.read())
            print(lb)
            json_result = json.dumps(lb.decode("utf-8"))
            return json_result
        return 'Error. Something went wrong.'
    else:
        return render_template('img_upload.jnj')


if __name__ == "__main__":
    app.run(debug=True)

Demo with Postman (Azure Web App and local):

[IMAGE]
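If you prefer a script to Postman, a minimal Python client for the same endpoint could look like this (the URL and file name are placeholders):

import requests

# Placeholder URL: http://127.0.0.1:5000/predict locally,
# or https://<your-web-app>.azurewebsites.net/predict once deployed to Azure
url = "http://127.0.0.1:5000/predict"

with open("test_artwork.jpg", "rb") as f:
    response = requests.post(url, files={"file": ("image.jpg", f, "image/jpeg")})

print(response.status_code, response.json())  # the predicted artwork label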

Create the mobile app on Azure Mobile App with Xamarin

The Xamarin application I developed is not updated to .NET Standard, but I don't expect any problems with the PCL (Portable Class Library) approach 😊. The solution is very simple: we will have a single user interface page and add the controls we need for the buttons, which are choose an image from the camera roll, take a photo, and classify.

<?xml version="1.0" encoding="utf-8" ?>
<ContentPage xmlns="http://xamarin.com/schemas/2014/forms"             xmlns:x="http://schemas.microsoft.com/winfx/2009/xaml"             xmlns:local="clr-namespace:CustomVision"             x:Class="CustomVision.MainPage">
    <StackLayout Padding="20">
        <Image x:Name="Img" Source="ic_launcher.png" MinimumHeightRequest="100" MinimumWidthRequest="100">
</Image>
        <Button Text="Elegir imagen" Clicked="ElegirImage">
</Button>
        <Button Text="Sacar foto" Clicked="TomarFoto">
</Button>
        <Button Text="Clasificar" Clicked="Clasificar">
</Button>
        <Label x:Name="ResponseLabel">
</Label>
        <ProgressBar x:Name="Accuracy" HeightRequest="20"/>
    </StackLayout>
</ContentPage>

In addition, in the code-behind of this XAML page we have the event handlers (functions) for each button:

using Newtonsoft.Json;
using Plugin.Media;
using Plugin.Media.Abstractions;
using System;
using System.IO;
using System.Linq;
using System.Net.Http;
using System.Net.Http.Headers;
using Xamarin.Forms;

namespace CustomVision
{
    public partial class MainPage : ContentPage
    {
        public const string ServiceApiUrl = "https://YOUR_WEB_APP_URL ";
        private MediaFile _foto = null;
        public MainPage()
        {
            InitializeComponent();
        }
         private async void ElegirImage(object sender, EventArgs e)
        {
            await CrossMedia.Current.Initialize(); 
            _foto = await Plugin.Media.CrossMedia.Current.PickPhotoAsync(new PickMediaOptions());
            Img.Source = FileImageSource.FromFile(_foto.Path);
        }
         private async void TomarFoto(object sender, EventArgs e)
        {
            await CrossMedia.Current.Initialize();
             if (!CrossMedia.Current.IsCameraAvailable || !CrossMedia.Current.IsTakePhotoSupported)
            {
                return;
            }
             var foto = await CrossMedia.Current.TakePhotoAsync(new StoreCameraMediaOptions()
            {
                PhotoSize = PhotoSize.Custom,
                CustomPhotoSize = 10,
                CompressionQuality = 92,
                Name = "image.jpg"
            }); 
            _foto = foto;
             if (_foto == null)
                return;
             Img.Source = FileImageSource.FromFile(_foto.Path);
        }
         private async void Clasificar(object sender, EventArgs e)
        {
            using (Acr.UserDialogs.UserDialogs.Instance.Loading("Clasificando..."))
            {
                if (_foto == null) return;
                 var httpClient = new HttpClient();
                var url = ServiceApiUrl;
                var requestContent = new MultipartFormDataContent();
                var content = new StreamContent(_foto.GetStream());
                content.Headers.ContentType = MediaTypeHeaderValue.Parse("image/jpg");
                 requestContent.Add(content, "file", "image.jpg");
                 var response = await httpClient.PostAsync(url, requestContent);
                 if (!response.IsSuccessStatusCode)
                {
                    Acr.UserDialogs.UserDialogs.Instance.Toast("Hubo un error en la deteccion...");
                    return;
                }
                 var json = await response.Content.ReadAsStringAsync();
                 var prediction = JsonConvert.DeserializeObject<string>(json); 
               if (prediction == null)
                {
                    Acr.UserDialogs.UserDialogs.Instance.Toast("Image no reconocida.");
                    return;
               }
                ResponseLabel.Text = $"{prediction}";
                //Accuracy.Progress = p.Probability;
            }
             Acr.UserDialogs.UserDialogs.Instance.Toast("Clasificacion terminada...");
        }
    }
}

 

Demo Xamarin Forms App

[IMAGE]

Image Test mobile app on iOS 11.2 (iPhone 6 Plus)

[IMAGE]

Image Test mobile app on Android 7.1 Nougat or 8.0 Oreo (Nexus -API 25-26)

Next Post!

Well, we have finished the third post. Honestly, I thought this post was going to be shorter, but while writing it I realized that more background was needed to understand the key concepts involved in carrying out computer vision projects and solutions. I hope these explanations and sample demonstrations have been helpful, and that they have shown you what a beautiful area computer vision is, with the possibility of manipulating images and extracting information from them.

In the next post we will compare TensorFlow and CNTK: the results obtained, manageability and usability, points of interest, and final conclusions from the research carried out for this series of posts.

Kind regards,
Alexander González (@GlezGlez96)
Microsoft Student Partner

 

 

What’s New in EDU- Intune for Education Special


As educators we are always looking for the simplest way to manage our classrooms, and for some of us this includes our classroom devices. The introduction of Intune for Education means managing devices has become a whole lot easier, creating a streamlined process which is accessible to all. In the latest special edition of 'What's New in Edu', Brad Anderson gives a walkthrough of Intune for Education. Check out the video here!

For more information you can read the full blog post, inclusive of a focus upon Intune for Education here.

What’s New in EDU: The latest in Learning Tools, Office 365, Teams, Intune and more

 


Interested in getting started? 

You can get started by setting up your free Office 365 Education, with a 90-day Intune for Education trial. Already have Office 365 Education? Start the free trial and sign in with your school account. Click here to find out more now.

How to setup Global VNet peering in Azure



Now that Global VNet peering has gone to GA I had a large university in California ask how to set this up.

What is Global VNet peering?

Global VNet peering in Azure is the ability to peer VNets or virtual networks across regions (see regional availability below).

image

Some benefits of Global VNet peering include:

  • Private Peering traffic stays on Azure network backbone
  • Low latency and high bandwidth VNet region to VNet region connectivity
  • No more VNet to VNet VPN configuration which means no VPN encryption, no gateways, no public internet necessary
  • No downtime setting up Global VNet peering with portal or ARM templates

What are some limitations of Global VNet peering?

Currently, one limitation as of April 2018, is that Global VNet peering is not available in all regions. It is slated to be available in the future for all regions:

  • The virtual networks can only exist in the following regions: West Central US (Wyoming), West US 2 (Washington), Central US (Iowa), East US 2 (Virginia), Canada Central (Toronto), Canada East (Quebec City), Southeast Asia (Singapore), Korea South (Busan), South India (Chennai), Central India (Pune), West India (Mumbai), UK South (London), UK West (Cardiff), West Europe (Netherlands)

You cannot use Global VNet peering to communicate with the VIPs of load balancers in another region. VIP communication requires the source IP to be on the same VNet as the load balancer's IP:

  • Resources in one virtual network cannot communicate with the IP address of an Azure internal load balancer in the peered virtual network. The load balancer and the resources that communicate with it must be in the same virtual network.

This is a big one, as it bit us with a customer. You must not check 'use remote gateways' or 'allow gateway transit' as you would when setting up intra-regional VNet peering:

  • You cannot use remote gateways or allow gateway transit. To use remote gateways or allow gateway transit, both virtual networks in the peering must exist in the same region.

VNet Global peerings are not transitive meaning downstream VNets in one region cannot talk with downstream VNets in another region.

Lastly, to use Global VNet peering both VNets must be in the same subscription and the same Azure AD tenant, so cross-subscription Global VNet peering is not currently supported.

How do I setup Global VNet peering?

It is fairly straightforward to setup Global VNet peering and I documented the steps below:

1) Set up VNets in each region that supports Global VNet peering (supported regions as of April 2018 are listed above). In my case, I created a VNet in West US 2 and West Central US.

2) For testing, create a VM in each region and associate it with the VNet you are going to use with Global VNet peering, so you can test VM connectivity. Recall that the VMs cannot be associated with a downstream VNet, since transitive downstream VNets are not supported with Global VNet peering.

3) Enable Global VNet peering:

3a) On one of the VNets you want to Global peer, go to Peerings and click Add

image

3b) Fill out the peering and select the VNet in the other region you want to Global Peer with – Important – don’t check any of the checkboxes!

image

3c) Set up the peering in the other direction by repeating steps 3a and 3b, but from the VNet in the other region – again, don't check any checkboxes, and pick the other VNet in the other region you want to Global peer with:

image

4) Let it finish provisioning and then validate VNet Global peering is Connected on both VNets:

image

image


5) Test Global VNet connectivity with Telnet, PSPing or ping. Note: ensure firewalls allow the ports you are telnetting to, ICMP, etc.:

Test connectivity from one direction:

image

Then test connectivity from the other direction:

image

6) You are done! Congratulations Global VNet peering is complete! 


Schedule VM Downtime With Azure Automation And PowerShell


Editor's note: The following post was written by Cloud and Datacenter Management MVP Timothy Warner as part of our Technical Tuesday series. Albert Duan of the MVP Award Blog Technical Committee served as the technical reviewer for this piece.

To save money, you want to shut down some of your Windows Server and Linux VMs running in Azure during non-working hours. For example, you probably need all the VMs in your production virtual network (VNet) up 24 hours per day, but you need the VMs in your test/dev VNet online only for particular time intervals.

As I am sure you know, human forgetfulness makes recurring tasks like this difficult to do manually. Today you will learn how to use Azure Automation runbooks to stop and start VMs on a schedule.

In this example, let's assume that I have four VMs in my user acceptance testing (UAT) VNet:

  • foo-uat-vm01
  • foo-uat-vm02
  • foo-uat-vm03
  • foo-uat-vm04

Now, let's get to work!

The TL/DR workflow

Before I get into the specifics, allow me to provide you with the "too long/didn't read," CliffsNotes procedure for using Azure Automation runbooks:

  1. Define an Azure Automation account and Azure Active Directory (Azure AD) security principal
  2. Add and test a PowerShell workflow-based Azure Automation runbook that performs the VM stop and start operations
  3. Use the Azure Automation scheduler to execute the runbook on your schedule

Create an Azure Automation account

Have you ever used System Center Orchestrator (SCORCH)?  Well, Azure Automation runbooks perform an analogous service in the Microsoft Azure cloud. To use these runbooks, you first need an Azure Automation account.

From the Automation Accounts blade in the Azure portal, click Add and fill in the following properties:

  • Friendly name
  • An active Azure subscription
  • Resource group
  • Azure region

As to whether to create an Azure Run As account, definitely choose Yes.

The Azure Run As account is an Azure-created security principal in your Azure AD tenant that provides the security context for your runbooks. The default name for this account is, imaginatively enough, AzureRunAsConnection. The account is granted the Contributor access role at the subscription level in Azure AD; I show you this in Figure 1.

Figure 1. Azure Run As Account details

Define the Azure Automation runbook

From your new Azure Automation accounts Settings menu, navigate to Process Automation > Runbooks, and then click Add a runbook. Complete the following properties in the Runbook blade:

  • Friendly name (required)
  • Runbook type (required)
  • Description (optional)

You have five choices for runbook type:

  • PowerShell script or function
  • Python v2
  • Graphical
  • PowerShell workflow
  • Graphical PowerShell workflow

In this example, we'll use PowerShell Workflow because the workflows give us the flexibility of PowerShell and the robustness and durability of the Windows Workflow foundation. Note that you can also import an existing script if you have one handy. In this example, however, we'll paste our code directly into the Azure portal.

Let me show you my workflow, and then I'll explain the relevant code lines.

Figure 2. Our VM reset workflow

NOTE: If you find my Reset-AzureRmVM workflow, then you're free to use it!

  • 1: I chose the verb Reset because it's one of the approved verbs (reference: Get-Verb)
  • 5-13: Some PowerShell scripters detest default parameter values. In this case it makes sense because your subscription ID and VM names are likely not to change. You can always override with different values as needed anyway
  • 15: If you change the identity of your Run As connection, make sure to update this variable's value
  • 22-37: Authenticate to your Azure subscription using the Run As connection identity and corresponding digital certificate
  • 40: Here we process the states in which the workflow is run with or without the -All switch parameter. In any event, we enumerate the VM names passed into the workflow and generate an array
  • 59: This is the logic to stop the VMs
  • 69: Otherwise, if -Stop isn't used (implied -Start), we assume we need to start the VMs

Make sure to click Save from time to time to preserve your work. When you're ready to test, click Test pane to invoke the testing environment. Again, I'll show you an annotated screenshot and explain:

Figure 3. Runbook test environment

  • 1: Note that Azure respects our default parameter values
  • 2: Here I override the default to specify a different VM
  • 3: Sadly, the testing interface doesn't present parameter enumerations in a drop-down list control
  • 4: Start, stop, suspend, and resume your workflow
  • 5: Check status. Note that the runbook actually runs; it's not a "what if" situation

You need to click Publish in the runbook editor to make the new runbook available for production execution.

Schedule the runbook

Schedules are shared objects in Azure Automation; thus, we need to create the schedule, and then bind it to our new PowerShell workflow. Specifically, we will need two schedules: one for VM shutdown, and another for VM startup.

From your Automation account's Settings menu, navigate to Shared Resources > Schedules, and click Add a schedule. Next, fill out the relevant properties:

  • Friendly name
  • Description
  • Start time/date
  • Time zone
  • Recurrence

In my lab, I created a schedule named VM-Shutdown that starts at 7:00 P.M. every day, and a schedule named VM-Startup that starts at 7:00 A.M. every day.

Now open your runbook Settings menu, navigate to Resources > Schedules, and then click Add a schedule. Select one of your two schedules, and optionally modify the run settings. Repeat the process, and your runbook properties should look similar to what I have in Figure 4.

Figure 4. Runbook schedules.

Next steps

You can do a lot of other cool configuration management tasks with your fancy new Azure automation account:

  • Inventory
  • Change Tracking
  • Update Management
  • PowerShell Desired State Configuration (DSC)

If you're thinking, "Hey! Those look like similar features to System Center Configuration Manager," then congratulations--you're correct. The Azure management solutions are what I like to call SCCM's cloud counterpart.

I hope you found this tutorial useful, and I wish you all the best.


Timothy Warner is a Microsoft Most Valuable Professional (MVP) in Cloud and Datacenter Management who is based in Nashville, TN. His professional specialties include Microsoft Azure, cross-platform PowerShell, and all things Windows Server-related. You can reach Tim via Twitter (@TechTrainerTim), LinkedIn, or his personal website, techtrainertim.com.

Visual Studio 2017 roadmap now available


With the release of Visual Studio 2017, we moved to a release schedule that delivers new features and fixes to you faster. With this faster iteration, we heard you would like more visibility into what’s coming. So, we’ve now published the Visual Studio Roadmap. The roadmap lists some of the more notable upcoming features and improvements but is not a complete list of all that is coming to Visual Studio.

When you look at the roadmap, you’ll see that we grouped items by quarter. Since every quarter includes several minor and servicing releases, the actual delivery of a feature could happen any time during the quarter. As we release these features and improvements, we’ll update the roadmap to indicate the release in which they are first available. The roadmap also includes suggestions from all of you, which are linked to the community feedback source.

Please let us know what you think about the roadmap in the comments, as our primary goal is to make it as useful as we can to you. We will be refreshing the document every quarter – with the next update coming around July. Stay tuned!

Thanks,

John

John Montgomery, Director of Program Management for Visual Studio
@JohnMont

John is responsible for product design and customer success for all of Visual Studio, C++, C#, VB, JavaScript, and .NET. John has been at Microsoft for 17 years, working in developer technologies the whole time.


Take the SQL Server Mac challenge


When I graduated from college, one of the first computers I ever used was a Macintosh. I loved the Mac, the user interface, and the overall footprint of that computer. I also started my career developing on UNIX systems with C++ and databases like Ingres. As I moved to other jobs, the PC was becoming very popular, as was the Windows operating system. When I joined Microsoft in 1993, I embarked on a 25-year journey working only on Windows laptops and Windows Server computers.

Last October we released SQL Server 2017 including support for Linux and Docker Containers. Since then, I have spent a great deal of my time talking to customers directly and at events about SQL Server on Linux. At one of these events in February in London (you may have heard of SQLBits), I was presenting on SQL Server on Linux and someone in the audience asked me this question. "Bob, I love what Microsoft is doing with Linux but I'm a MacBook user. I want to use my MacBook and run SQL Server on it". I thought for a second on this question and then came up with the idea of the "SQL Server Mac Challenge". I told the audience that with a reasonable internet connection, I could get any MacBook user up and running and connected to SQL Server with no Windows or Virtualization software in 5 minutes or less. The person in the audience took me up on my challenge and posted something on Twitter the next day that it worked!

Given my work lately on SQL Server on Linux, I asked my manager if I could get a MacBook so I could show off SQL Server to the MacBook user community. His answer was "of course!" Perhaps you are reading this and think this must be the Twilight Zone. Does this guy still work for Microsoft?

So here in this blog post, I will show you my journey in taking the SQL Server Mac Challenge. I'm happy to tell you it took only 4 minutes on my MacBook Pro.

First, you need to download Docker for Mac, as seen in this screenshot of the download page (https://store.docker.com/editions/community/docker-ce-desktop-mac)

The download is not too large and didn't take long on my internet connection. In the bottom right corner of my MacBook is an icon for downloads. I selected this to extract the downloaded image.

When the image for Docker for Mac is extracted, a new window pops up so I can install it as an application. I just used my mouse to drag the Docker icon onto the Applications icon on this screen.

When this completed, I selected the Launchpad application on the Dock, and it shows the Docker application installed.

I double-clicked the Docker application. Now I see a new icon in the top status bar on my MacBook showing that Docker is starting up.

While Docker is starting up, I now decided to multi-task and download our new open-source, cross-platform tool called SQL Operations Studio. You can download the Mac version at https://docs.microsoft.com/en-us/sql/sql-operations-studio/download?view=sql-server-2017.

While SQL Operations Studio is downloading, I can pull the docker image for SQL Server. The steps for pulling docker images for SQL Server can be found at https://docs.microsoft.com/en-us/sql/linux/quickstart-install-connect-docker?view=sql-server-linux-2017. Since the terminal for MacBook is a bash shell, I just ran this command in the terminal

docker pull microsoft/mssql-server-linux:2017-latest

Here is my terminal screen showing the docker pull in action

While the docker pull is now downloading the docker image for SQL Server, I went back to extract the SQL Operations Studio download.

If you look closely at this screen, the docker pull has completed so now I can start up a docker container with these commands:

docker run -e 'ACCEPT_EULA=Y' -e 'MSSQL_SA_PASSWORD=Sql2017isfast' -p 1401:1433 --name sql1 -d microsoft/mssql-server-linux:2017-latest

Notice the -p parameter, which maps host port 1401 to container port 1433. When I connect to this SQL Server I will use port 1401.

The result of this command looks like this. When this completes SQL Server is now up and running in a Docker container. Docker on Mac is a native Mac application.

Note: There is an issue with Docker on Mac and SQL Server using Host Volume Mapping. You can use data volume containers instead. See this note at https://docs.microsoft.com/en-us/sql/linux/sql-server-linux-configure-docker?view=sql-server-linux-2017#persist and the github issue at https://github.com/Microsoft/mssql-docker/issues/12.

SQL Operations Studio has finished extracting, so I'll select it, install it as an application, and launch it. Again, SQL Operations Studio for Mac is a native Mac application.

When SQL Operations Studio launches, it pops up a window for me to supply a server to connect to. I'll use the local IP address, port 1401, and the sa password I supplied when running the container.

When I connect, SQL Operations Studio shows an Object Explorer and a dashboard.

I run a query by right-clicking the server and selecting New Query.

And now I'll run SELECT @@version to prove I can run a query.

There you have it. I did this in less than 5 minutes!

So calling all MacBook users. Take the SQL Server Mac Challenge!

Bob Ward

Microsoft

Unified Service Desk 3.3.0 is Released


Continuing toward our goal of bringing the best and brightest Dynamics 365 experiences to our users, enabling our developer community to build and deploy robust solutions, and providing our customers with an even more reliable and compliant Unified Service Desk, we have released the latest version, 3.3.0.

Download Unified Service Desk 3.3.0

The highlights of this release are:

Host Unified Interface Apps in Unified Service Desk (Preview)

With the release of Dynamics 365 (online), version 9.0, we've introduced a new user experience - Unified Interface - which uses responsive web design principles to provide an optimal viewing and interaction experience for any screen size, device, or orientation. With this release, Unified Service Desk supports apps built using the Unified Interface framework, which is available in the latest Dynamics 365 (online), version 9.0. For now this is a preview capability that partners and customers can use to build new solutions with Unified Interface. The release contains the client-side changes that allow working with Unified Interface apps, solution changes that allow creating the requisite configurations, and a sample package that showcases the capabilities.

 

Read more:
Support for Unified Interface Apps

Analyze best practices in Unified Service Desk

Best practice analyzer (BPA) is a developer tool that helps identify deployment and configuration issues in Unified Service Desk deployment environments. It has guidelines about System Configurations, Unified Service Desk and Internet Explorer settings, and Unified Service Desk configurations in Dynamics 365. Consider these guidelines as our recommended way to use Unified Service Desk and serve your customers.

Read More:

Analyze best practices in Unified Service Desk
Download and install Best Practices Analyzer

Unified Service Desk Improvement Program

Improvement program data lets Unified Service Desk send application-specific information like product usage, health, and performance data to Microsoft. We use the information we collect from the program to analyze and improve the service and product experience for our customers. Starting with 3.3.0, the improvement program is enabled by default for Unified Service Desk deployments against Dynamics CRM Online.

We have also built a feedback mechanism which you can integrate in your solution, allowing agents to provide verbatim feedback and usage sentiment to Microsoft.

Read more:

Help improve Unified Service Desk 

Reliability Improvements

There are multiple improvements to the overall reliability of Unified Service Desk, particularly around management of the Internet Explorer process and an immersive recovery experience when the Internet Explorer process encounters a fault.

Read more:

Recover Internet Explorer process instance

Comply with the General Data Protection Regulation

The General Data Protection Regulation (GDPR) imposes new rules on organizations in the European Union (EU) and those that offer goods and services to people in the EU, or those that collect and analyze data tied to EU residents, regardless of where they are located. At the links below, we have outlined guidance for Unified Service Desk customers on complying with the GDPR.

Comply with General Data Protection Regulation (GDPR) 

Call to Action:

Customers and partners are encouraged to validate the latest release in their environments and plan for an upgrade. They can use the preview feature to build solutions with Unified Interface apps in Dynamics 365 (online), version 9.0 organizations. New and existing customers can use the Best Practices Analyzer tool to validate their solutions and deployments for adherence to best practices.

For more details on what is new in the Unified Service Desk 3.3.0, see:

What's new for Administrators
What's new for Developers


Announcing a single C++ library manager for Linux, macOS and Windows: Vcpkg


At Microsoft, the core of our vision is “Any Developer, Any App, Any Platform” and we are committed to bringing you the most productive development tools and services to build your apps across all platforms. With this in mind, we are thrilled to announce today the availability of vcpkg on Linux and MacOS. This gives you immediate access to the vcpkg catalog of C++ libraries on two new platforms, with the same simple steps you are familiar with on Windows and UWP today.

Vcpkg has come a long way since its launch at CppCon 2016. Starting from only 20 libraries, we have seen an incredible growth in the last 19 months with over 900 libraries and features now available. All credit goes to the invaluable contributions from our amazing community.

In the feedback you gave us so far, Linux and Mac support was the most requested feature by far. So we are excited today to see vcpkg reach an even wider community and facilitate cross-platform access to more C++ libraries. We invite you today to try vcpkg whether you target Windows, Linux or MacOS.

To learn more about using vcpkg on Windows, read our previous post on how to get started with vcpkg on Windows.

Using vcpkg on Linux and Mac

The Vcpkg tool is now compatible with Linux, Mac and other POSIX systems. This was made possible only through the contributions of several fantastic community members.

At the time of writing this blog post, over 350 libraries are available for Linux and Mac, and we expect that number to grow quickly. We currently test daily on Ubuntu LTS 16.04/18.04, and we have had success on Arch, Fedora and FreeBSD.

 

Getting started:

1)      Clone the vcpkg repo: git clone https://github.com/Microsoft/vcpkg

2)      Bootstrap vcpkg: ./bootstrap-vcpkg.sh

3)      Once vcpkg is built, you can build any library using the following syntax:

vcpkg install sdl2

This will install sdl2:x64-linux (x64 static is the default and only option available on Linux)

The result (.h, .lib) is stored in the same folder tree; reference this folder in your build system configuration.

 

4)      Using the generated library

  a. If you use CMake as your build system, then you should use CMAKE_TOOLCHAIN_FILE to make libraries available with `find_package()`. E.g.: cmake .. "-DCMAKE_TOOLCHAIN_FILE=vcpkg/scripts/buildsystems/vcpkg.cmake"

 

  b. Otherwise, you should reference the vcpkg folder containing the headers (vcpkg/installed/x64-linux/include) and also the one containing the libraries (vcpkg/installed/x64-linux/lib) to be able to build your project using the generated libraries.

Using vcpkg to target Linux from Windows via WSL

As WSL is a Linux system, we'll use WSL as we did with Linux. Once configured correctly, you will be able to produce Linux binaries from your Windows machine as if they had been generated from a Linux box. Follow the same instructions as for installing on Linux. See how to set up WSL on Windows 10, and configure it with the Visual Studio extension for Linux.

As shown in the screenshot above, the vcpkg directory could be shared between Windows and WSL. In this example sdl2 and sqlite3 were built from WSL (binaries for Linux); sqlite3 was built also for Windows (Windows dll).

In closing

Install vcpkg on Linux or Mac, try it in your cross-platform projects, and let us know how we can make it better and what your cross-platform usage scenario is.

As always, your feedback and comments really matter to us. Open an issue on GitHub or reach out to us at vcpkg@microsoft.com with any comments and suggestions, or take a moment to complete our survey.

 

Introducing Best Practices Analyzer for Unified Service Desk


Empowering our customers to create successful customer service solutions is among our major goals. Often, customizers rely on solutions and workarounds proposed by the Unified Service Desk community and are unsure whether their approach complies with best practices. So, we decided to make it easier to validate Unified Service Desk solutions against those standards. We're very excited to introduce the Best Practices Analyzer for Unified Service Desk.

When there is more than one way to solve a problem, the approaches that have worked out well for many of our customers are what we call best practices.

Best Practices Analyzer helps customizers by finding improper configurations in their USD solutions on three fronts – Unified Service Desk Configurations, Internet Explorer Settings, and System Configurations.

Here are a few points that will help you ramp up with the tool:

How to generate a Report

  1. Go to Settings and click Best Practices Analyzer

How to read the Report

  1. Report Snapshot
  • Shows Computer Name, Analysis Time and Score. This information is particularly helpful when analyzing past reports or when reports are shared with stakeholders such as the IT admin, CRM admin, etc.

  2. Rules Snapshot
  • Shows the list of rules sorted by result in the following order: Error, Warning and Passed. Admins should expand each of these rules to understand the problem and fix the issues according to the prescribed Mitigation.

When to leverage the tool

  1. Identify gaps in existing solutions
  • If a contact center agent reports performance or crash-related issues, we advise CRM admins to run the tool on the current Unified Service Desk solution and fix any issues
  • We advise CRM admins to run the tool on current solutions to find any unreported issues and potentially improve agents’ experience
  2. Identify gaps when enhancing solutions / building new solutions
  • When enhancing Unified Service Desk solutions, we advise CRM admins to run the tool on the new solution and fix any issues before rollout

Read more: Support matrix and download location

How to share these reports

Every time a user runs a report, the report is saved to the Downloads location as an HTML file. (These files can be opened in a browser and don't require the USD client.)

What Next: This list is just a start. We’d love to hear your feedback and improve upon the rules list based on the feedback.

 

Read more:

Analyze best practices in Unified Service Desk

 

Rotate the expired or nearly expired certificates on your downloadable VHD


To rotate certificates on machines created from the Dynamics 365 for Finance and Operations downloadable VHD, complete the following steps for each certificate. Sample PowerShell scripts are provided where applicable.
1. Identify which certificates will expire in the next two months.
Get-ChildItem -path Cert:\LocalMachine\My | Where {$_.NotAfter -lt $(get-date).AddMonths(2)} | Format-Table Subject, Thumbprint, NotAfter | Sort NotAfter
2. Record the thumbprint of the certificate that needs to be replaced. You will need this in the next step.
3. Obtain a new certificate for the expired certificate.
Set-Location -Path "cert:\LocalMachine\My"
$OldCert = (Get-ChildItem -Path "<thumbprint recorded in step 2>")
New-SelfSignedCertificate -CloneCert $OldCert

Note: The thumbprint must be entered without spaces. For more information and an example, see the New-SelfSignedCertificate Powershell documentation.
4. Find and replace all references to the thumbprint of the expired certificate with the thumbprint of the newly created certificate in the configuration files below. These files can be found under C:\AOSService\webroot.
web.config
wif.config
wif.services.config
5. Restart IIS.
iisreset

SQL Server Reporting Services failed to start


SQL Server Reporting Services failed to start


Problem:

The SQL Server Reporting Services (MSSQLSERVER) service failed to start due to the following error:

The service did not respond to the start or control request in a timely fashion.

 

Resolution:

There are three ways to resolve this; below is the one I use most often.

To increase the default service time-out, follow these steps:

    1. Click Start, click Run, type regedit in the Open box, and then click OK.
    2. Locate and then select the following registry subkey: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control
    3. Right-click Control, point to New, and then click DWORD.
    4. In the New Value box, type ServicesPipeTimeout, and then press Enter.
    5. Right-click ServicesPipeTimeout, and then click Modify.
    6. Click Decimal, type the number of milliseconds that you want to wait until the service times out, and then click OK. For example, to wait 60 seconds before the service times out, type 60000.
    7. On the File menu, click Exit, and then restart the computer.

 

For other resolutions, see the following linked documents:

You cannot start SQL Server Reporting Services after you apply the update that is discussed in KB 2677070

https://support.microsoft.com/en-us/help/2745448/you-cannot-start-sql-server-reporting-services-after-you-apply-the-upd

Reporting Services service doesn’t start after the installation of MS12-070 security patch

https://blogs.msdn.microsoft.com/mariae/2012/11/12/reporting-services-service-doesnt-start-after-the-installation-of-ms12-070-security-patch/

 

Guideline for SQL Server configuration, installation and database creation


Guideline for SQL Server configuration, installation and database creation


1. Consider changing the Windows Power Options plan to High Performance

The default power plan is Balanced: the system dynamically adjusts CPU performance and automatically raises it under heavy load. According to the KB article below, if you run into performance problems you should first consider updating the BIOS and applying the relevant Windows hotfixes; another option is to switch the power plan to High Performance, at the cost of unnecessary power consumption during normal, lighter use.

Slow Performance on Windows Server when using the “Balanced” Power Plan

https://support.microsoft.com/en-us/help/2207548/slow-performance-on-windows-server-when-using-the-balanced-power-plan

To change a power plan:
1. Click on Start and then Control Panel. 
2. From the list of displayed item under Control Panel click on Power Options, which takes you to Select a power plan page. If you do not see Power Options, type the word 'power' in the Search Control Panel box and then select Choose a power plan. 
3. By default, the option to change power plans is disabled. To enable this, click the Change settings that are currently unavailable link. 
4. Choose the High Performance option 
5. Close the Power Option window. 

2. Disk partitions for SQL Server

When formatting the drive, choose an Allocation Unit Size of 64 KB.

Recommendations and Guidelines on configuring disk partitions for SQL Server

https://support.microsoft.com/zh-tw/kb/2023571

OS partition alignment defaults:

Windows Server 2003 and earlier: partitions are not aligned by default; partition alignment must be performed explicitly. The default alignment is 32,256 bytes.

Windows Server 2008: new partitions are likely to be aligned. The default alignment is 1024 KB (1,048,576 bytes). This value works well with commonly used stripe unit sizes of 64 KB, 128 KB and 256 KB, as well as the less frequently used values of 512 KB and 1024 KB.

Check the current Allocation Unit Size of the D: drive; Bytes Per Cluster is the Allocation Unit Size.
D:\>fsutil fsinfo ntfsinfo d:

NTFS Volume Serial Number :       0xa2060a7f060a54a
Version :                         3.1
Number Sectors :                  0x00000000043c3f5f
Total Clusters :                  0x000000000008787e
Free Clusters  :                  0x000000000008746e
Total Reserved :                  0x0000000000000000
Bytes Per Sector  :               512
Bytes Per Cluster :               65536
Bytes Per FileRecord Segment    : 1024
Clusters Per FileRecord Segment : 0
Mft Valid Data Length :           0x0000000000010000
Mft Start Lcn  :                  0x000000000000c000
Mft2 Start Lcn :                  0x0000000000043c3f
Mft Zone Start :                  0x000000000000c000
Mft Zone End   :                  0x000000000001cf20

An appropriate value for most installations should be 65,536 bytes (that is, 64 KB) for partitions on which SQL Server data or log files reside. In many cases, this is the same size for Analysis Services data or log files, but there are times where 32 KB provides better performance. To determine the right size, you will need to do performance testing with your workload comparing the two different block sizes.

 

The following example formats a drive from the command line and specifies the allocation unit (cluster) size.

Here is an example in which the F: drive is created on disk 3, aligned with an offset of 1,024 KB, and formatted with a file allocation unit (cluster) size of 64 KB.

C:\>diskpart

Microsoft DiskPart version 6.0.6001
Copyright (C) 1999-2007 Microsoft Corporation.
On computer: ASPIRINGGEEK
DISKPART> list disk
  Disk ###  Status      Size     Free     Dyn  GPT
  --------  ----------  -------  -------  ---  ---
  Disk 0    Online      186 GB     0 B
  Disk 1    Online      100 GB     0 B
  Disk 2    Online      120 GB     0 B
  Disk 3    Online      150 GB   150 GB
DISKPART> select disk 3
Disk 3 is now the selected disk.
DISKPART> create partition primary align=1024
DiskPart succeeded in creating the specified partition.
DISKPART> assign letter=F
DiskPart successfully assigned the drive letter or mount point.
DISKPART> format fs=ntfs unit=64K label="MyFastDisk" nowait

Reference:

Disk Partition Alignment Best Practices for SQL Server
https://technet.microsoft.com/en-us/library/dd758814(v=sql.100).aspx

Disk Partition Alignment: It Still Matters–DPA for Windows Server 2012, SQL Server 2012, and SQL Server 2014

https://blogs.msdn.microsoft.com/jimmymay/2014/03/14/disk-partition-alignment-it-still-matters-dpa-for-windows-server-2012-sql-server-2012-and-sql-server-2014/

 

3. Max server memory (configure SQL Server max server memory)

Basic rule of thumb for a host dedicated to the SQL Server Database Engine:

Windows Server 2008 or later: reserve at least 2 GB for the OS and assign the rest as max server memory.

Windows Server 2003: reserve at least 1 GB for the OS and assign the rest as max server memory.

Note: if other SQL Server components (SSAS, SSRS, and so on) or other services and programs (antivirus, backup agents, and so on) run on the same host, reduce max server memory accordingly to leave memory for them.

A more precise calculation is as follows (a hedged T-SQL sketch follows the notes below):

Use max_server_memory to guarantee the OS does not experience detrimental memory pressure. To set max server memory configuration, monitor overall consumption of the SQL Server process in order to determine memory requirements. To be more accurate with these calculations for a single instance:

  • From the total OS memory, reserve 1GB-4GB to the OS itself.
  • Then subtract the equivalent of potential SQL Server memory allocations outside the max server memory control, which is comprised of stack size (note 1) * calculated max worker threads (note 2) + the -g startup parameter (note 3; 256 MB by default if -g is not set). What remains should be the max_server_memory setting for a single instance setup.

Note 1: Refer to the Memory Management Architecture guide for information on thread stack sizes per architecture.

Note 2: Refer to the documentation page on how to Configure the max worker threads Server Configuration Option, for information on the calculated default worker threads for a given number of affinitized CPUs in the current host.

Note 3: Refer to the documentation page on Database Engine Service Startup Options for information on the -g startup parameter.
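
The calculated value is then applied with sp_configure. The sketch below is a hedged example only: the 12288 MB figure assumes a hypothetical dedicated host with 16 GB of RAM where roughly 4 GB is left to the OS, and is not a value prescribed by this guideline.

-- Hypothetical example: 16 GB host, ~4 GB reserved for the OS and other
-- processes, so the Database Engine is capped at 12288 MB.
EXEC sys.sp_configure N'show advanced options', 1;
RECONFIGURE;

EXEC sys.sp_configure N'max server memory (MB)', 12288;
RECONFIGURE;

-- Verify the configured and running values.
SELECT name, value, value_in_use
FROM sys.configurations
WHERE name = N'max server memory (MB)';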

Reference:

Server Memory Server Configuration Options
https://docs.microsoft.com/en-us/sql/database-engine/configure-windows/server-memory-server-configuration-options?view=sql-server-2017

 

4. SQL Server startup account

Configure the SQL Server startup (service) account, for example CONTOSO\sqlservice.

 

5. Lock Pages in Memory

Local Security Policy > Local Policies > User Rights Assignment > Lock pages in memory

To enable the Lock Pages in Memory option:

  1. On the Start menu, click Run. In the Open box, type gpedit.msc. The Group Policy dialog box opens.
  2. On the Group Policy console, expand Computer Configuration, and then expand Windows Settings.
  3. Expand Security Settings, and then expand Local Policies.
  4. Select the User Rights Assignment folder. The policies will be displayed in the details pane.
  5. In the pane, double-click Lock pages in memory.
  6. In the Local Security Policy Setting dialog box, add the account with privileges to run sqlservr.exe (the SQL Server startup account), for example CONTOSO\sqlservice.

If this option is grayed out in the Local Security Policy console, the setting is controlled by a GPO in Active Directory; configure it there instead.

Lock Pages in Memory (LPIM)

This Windows policy determines which accounts can use a process to keep data in physical memory, preventing the system from paging the data to virtual memory on disk. Locking pages in memory may keep the server responsive when paging memory to disk occurs. The Lock Pages in Memory option is set to ON in instances of SQL Server Standard edition and higher when the account with privileges to run sqlservr.exe has been granted the Windows Lock Pages in Memory (LPIM) user right.

To disable the Lock Pages in Memory option for SQL Server, remove the Lock Pages in Memory user right from the account with privileges to run sqlservr.exe (the SQL Server startup account).

Setting this option does not affect SQL Server dynamic memory management, allowing it to expand or shrink at the request of other memory clerks. When using the Lock Pages in Memory user right it is recommended to set an upper limit for max server memory as detailed above.
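
To confirm whether locked pages are actually in use, newer builds expose the memory model through a DMV. This is a hedged check only: the sql_memory_model_desc column is available on SQL Server 2016 SP1 and later, and this query is not part of the original guideline.

-- LOCK_PAGES means the instance is using locked pages (LPIM in effect);
-- CONVENTIONAL means it is not.
SELECT sql_memory_model, sql_memory_model_desc
FROM sys.dm_os_sys_info;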

 

Reference:

Server Memory Server Configuration Options
https://docs.microsoft.com/en-us/sql/database-engine/configure-windows/server-memory-server-configuration-options?view=sql-server-2017

 

6. Tempdb files

(1) Number of data files

Number of logical CPUs used by SQL Server     Number of tempdb data files
8 or fewer                                    equal to the number of logical CPUs
More than 8                                   start with 8 data files

Then monitor: if contention is still observed, add 4 data files at a time, up to a maximum equal to the number of logical CPUs.

Watch for contention (for example, metadata contention with waitresource = 2:1:1 or 2:1:3) before deciding whether to add more files.

(2)   Pre-size (data file size)

With 8 data files, set each one to an initial size of 1024 MB.

With 4 data files, set each one to an initial size of 1024 MB.

With 2 data files, set each one to an initial size of 2048 MB.

Note: monitor and adjust afterwards.

(3)   Autogrow

Start with a growth increment of 200 MB.

General guidelines for an appropriate growth increment:

tempdb file size        FILEGROWTH increment
0 to 100 MB             10 MB
100 to 200 MB           20 MB
200 MB or more          10%* (see the note below for this value)

* You may have to adjust this percentage based on the speed of the I/O subsystem on which the tempdb files are located. To avoid potential latch time-outs, we recommend limiting the autogrow operation to approximately two minutes. For example, if the I/O subsystem can initialize a file at 50 MB per second, the FILEGROWTH increment should be set to a maximum of 6 GB, regardless of the tempdb file size. If possible, use instant database file initialization to improve the performance of autogrow operations. (A hedged T-SQL sketch of these tempdb settings follows.)
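
The settings above can be applied with ALTER DATABASE instead of through SSMS. This is a minimal sketch only, assuming a hypothetical 4-logical-CPU instance and a T:\tempdb folder; the file names, path, and sizes are illustrative assumptions, not values from this guideline.

-- Hypothetical 4-logical-CPU instance: 4 data files of 1024 MB each, 200 MB autogrow.
-- tempdev and templog are the default tempdb file names.
ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, SIZE = 1024MB, FILEGROWTH = 200MB);
ALTER DATABASE tempdb ADD FILE (NAME = tempdev2, FILENAME = N'T:\tempdb\tempdev2.ndf', SIZE = 1024MB, FILEGROWTH = 200MB);
ALTER DATABASE tempdb ADD FILE (NAME = tempdev3, FILENAME = N'T:\tempdb\tempdev3.ndf', SIZE = 1024MB, FILEGROWTH = 200MB);
ALTER DATABASE tempdb ADD FILE (NAME = tempdev4, FILENAME = N'T:\tempdb\tempdev4.ndf', SIZE = 1024MB, FILEGROWTH = 200MB);
ALTER DATABASE tempdb MODIFY FILE (NAME = templog, FILEGROWTH = 200MB);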

 

7. Enable instant file initialization to improve autogrow performance

Database File Initialization

https://msdn.microsoft.com/en-us/library/ms175935(v=sql.105).aspx

Configure instant file initialization to improve the efficiency of autogrow operations

(data files are not zero-filled when they are initialized or grown)

Database Instant File Initialization

https://msdn.microsoft.com/en-us/library/ms175935.aspx

Data and log files are initialized to overwrite any existing data left on the disk from previously deleted files. Data and log files are first initialized by filling the files with zeros when you perform one of the following operations:

  •     Create a database.
  •     Add files, log or data, to an existing database.
  •     Increase the size of an existing file (including autogrow operations).
  •     Restore a database or filegroup.

File initialization causes these operations to take longer. However, when data is written to the files for the first time, the operating system does not have to fill the files with zeros.


To grant an account the Perform volume maintenance tasks permission (a hedged verification query follows these steps):

(1)   On the computer running SQL Server, open the Local Security Policy application (secpol.msc).

(2)   In the left pane, expand Local Policies, and then click User Rights Assignment.

(3)   In the right pane, double-click Perform volume maintenance tasks.

(4)   Click Add User or Group and add the SQL Server startup account (for example CONTOSO\sqlservice).

(5)   Click Apply, and then close all Local Security Policy dialog boxes.
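
To verify that instant file initialization is actually in effect for the Database Engine, the service DMV can be queried. This is a hedged check only: the instant_file_initialization_enabled column exists on SQL Server 2016 SP1 and later; on older versions, check the SQL Server error log at startup instead.

-- 'Y' means instant file initialization is enabled for the Database Engine service account.
SELECT servicename, service_account, instant_file_initialization_enabled
FROM sys.dm_server_services
WHERE servicename LIKE N'SQL Server (%';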

 

8. User database files

(1)   Basic configuration principles

  1.      Create DataFG1 for data and set it as the default filegroup; do not use the PRIMARY filegroup for user data.
  2.      Create IndexFG1 for indexes; do not use the PRIMARY filegroup, and specify this filegroup when creating non-clustered indexes.

(2)   If possible, place the transaction log file on a dedicated drive.

(3)   If possible, spread multiple data files across multiple dedicated drives.

(4)   Pre-size the data files

  1.      For a small database with a single data file, estimate the size it will grow to and set that size up front, for example 20 GB.
  2.      For a large database, create multiple files: estimate the total size and divide it by the number of files to get each file's size, for example 4 data files of 50 GB each.

(5)   Autogrow: start with a growth increment of 200 MB. (A hedged CREATE DATABASE sketch follows.)
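
The following CREATE DATABASE sketch puts points (1) through (5) together. It is an illustration only: the database name, drive letters, paths, and file sizes are assumptions, not values from this guideline.

-- Hypothetical layout: data on D:, indexes on E:, log on L: (separate drives),
-- pre-sized files with a 200 MB autogrow increment.
CREATE DATABASE SalesDB
ON PRIMARY
    (NAME = SalesDB_sys,    FILENAME = N'D:\SQLData\SalesDB_sys.mdf',     SIZE = 512MB, FILEGROWTH = 200MB),
FILEGROUP DataFG1
    (NAME = SalesDB_data1,  FILENAME = N'D:\SQLData\SalesDB_data1.ndf',   SIZE = 20GB,  FILEGROWTH = 200MB),
FILEGROUP IndexFG1
    (NAME = SalesDB_index1, FILENAME = N'E:\SQLIndex\SalesDB_index1.ndf', SIZE = 10GB,  FILEGROWTH = 200MB)
LOG ON
    (NAME = SalesDB_log,    FILENAME = N'L:\SQLLog\SalesDB_log.ldf',      SIZE = 4GB,   FILEGROWTH = 200MB);
GO

-- Make DataFG1 the default filegroup so new user tables avoid PRIMARY.
ALTER DATABASE SalesDB MODIFY FILEGROUP DataFG1 DEFAULT;
GO

-- Non-clustered indexes are then placed on IndexFG1 explicitly, for example:
-- CREATE NONCLUSTERED INDEX IX_Orders_CustomerID ON dbo.Orders (CustomerID) ON IndexFG1;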

 
