An earlier article showed how to distribute relational-style data across servers with auto-sharding. This one shows how to do the same for physical files (small files, mostly under 100 KB).
First, the test environment (similar to the one in the previous article):
Two shard servers and one config server are simulated, all running on the machine 10.0.4.85 and differing only in port:
Shard1:27020
Shard2:27021
Config:27022
mongos uses the default port 27017 when it starts.
Create the following folders on each of the C, D and E drives:
mongodb/bin
mongodb/db
Then, from the CMD command line, start the mongod executable in each folder in turn:
c:/mongodb/bin/mongod --dbpath c:/mongodb/db/ --port 27020
d:/mongodb/bin/mongod --dbpath d:/mongodb/db/ --port 27021
e:/mongodb/bin/mongod --configsvr --dbpath e:/mongodb/db/ --port 27022 (note: this is the config server)
Next start mongos, which listens on port 27017 by default:
e:/mongodb/bin/mongos --configdb 10.0.4.85:27022
Then open the mongo shell:
E:/mongodb/bin>mongo [Enter] (specifying a port here sometimes causes problems with the addshard command below)
> use admin
switched to db admin
> db.runCommand( { addshard : "10.0.4.85:27020", allowLocal : 1, maxSize:2 , minKey:1, maxKey:10} )
-- Adds a shard. maxSize is in MB; the small value here is only meant to demonstrate sharding.
{ "shardAdded" : "shard0000", "ok" : 1 }
> db.runCommand( { addshard : "10.0.4.85:27021", allowLocal : 1, minKey:1000} )
{ "shardAdded" : "shard0001", "ok" : 1 }
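The maxSize value passed to addshard is given in megabytes. As a rough sketch of what the 2 MB cap on the first shard means in bytes (plain JavaScript with no MongoDB connection; the helpers `maxSizeBytes` and `acceptsMoreChunks` are hypothetical, and the rule that a shard stops receiving chunks once it reaches its cap is my assumption about the balancer, not something shown in this article):

```javascript
// Hypothetical helpers: they model the maxSize setting, they are not MongoDB APIs.
const MB = 1024 * 1024;

// Shard definitions copied from the addshard commands above.
const shards = [
  { host: "10.0.4.85:27020", maxSize: 2 }, // capped at 2 MB
  { host: "10.0.4.85:27021" },             // no maxSize given
];

// Convert a shard's maxSize (MB) to bytes; Infinity stands for "no cap" (assumption).
function maxSizeBytes(shard) {
  return shard.maxSize ? shard.maxSize * MB : Infinity;
}

// Assumed rule: a shard accepts more chunks only while it is below its cap.
function acceptsMoreChunks(shard, dataSizeBytes) {
  return dataSizeBytes < maxSizeBytes(shard);
}

console.log(maxSizeBytes(shards[0]));              // 2097152
console.log(acceptsMoreChunks(shards[0], 3 * MB)); // false
console.log(acceptsMoreChunks(shards[1], 3 * MB)); // true
```

With a cap this small, the first shard fills up after only a few megabytes of files, which is why the test data spreads to the second shard quickly.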
Note: to remove a shard, use the following form:
db.runCommand( { removeshard : "localhost:10000" } );
> db.runCommand({listshards:1}); -- lists the shard nodes
> config = connect("10.0.4.85:27022")
> config = config.getSisterDB("config")
> dnt_mongodb=db.getSisterDB("dnt_mongodb");
dnt_mongodb
> db.runCommand({enablesharding:"dnt_mongodb"})
{ "ok" : 1 }
> db.printShardingStatus()
--- Sharding Status ---
sharding version: { "_id" : 1, "version" : 3 }
shards:
{
"_id" : "shard0000",
"host" : "10.0.4.85:27020",
"maxSize" : NumberLong( 2 )
}
{ "_id" : "shard0001", "host" : "10.0.4.85:27021" }
databases:
{ "_id" : "admin", "partitioned" : false, "primary" : "config" }
{ "_id" : "dnt_mongodb", "partitioned" : true, "primary" : "shard0001" }
> db.runCommand( { shardcollection : "dnt_mongodb.attach_gfstream.chunks", key : { files_id : 1 } } ) -- Unlike the earlier data-storage example, files_id currently appears to be the only shard key supported here
{ "collectionsharded" : "dnt_mongodb.attach_gfstream.chunks", "ok" : 1 }
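Note: before running the command above, files_id must be set up as a unique index. This is only safe here because every test file is small enough to fit in a single GridFS chunk.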
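GridFS stores each file as one or more chunk documents that share the same files_id, so a unique index on files_id alone can only hold while each file occupies exactly one chunk. A minimal sketch of the arithmetic, assuming a 256 KB chunk size (the real default depends on the driver and its version) and using a hypothetical helper `chunkCount`:

```javascript
// Assumed GridFS chunk size; check your driver, the default may differ.
const CHUNK_SIZE = 256 * 1024;

// Number of chunk documents needed to store a file of the given size in bytes.
function chunkCount(fileSizeBytes, chunkSize = CHUNK_SIZE) {
  return Math.ceil(fileSizeBytes / chunkSize);
}

console.log(chunkCount(100 * 1024));  // 1: a 100 KB file fits in one chunk
console.log(chunkCount(300 * 1024));  // 2: two chunk documents share one files_id
console.log(chunkCount(1024 * 1024)); // 4
```

Under that assumption the files in this test (mostly under 100 KB) each produce one chunk, while any file larger than the chunk size would produce several documents with the same files_id and violate the unique index.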
With sharding created and configured, the next step is to load some test data. I used the code below to read a local file and add it to MongoDB in bulk (the loop changes the file name each time, so it adds many files of the same size).
// Requires System and System.IO in addition to the MongoDB driver namespaces.
/// <summary>
/// Uploads a file to MongoDB.
/// </summary>
/// <param name="uploadDir">Directory containing the file to upload</param>
/// <param name="fileName">Name of the file to upload</param>
/// <returns>true if every copy was stored; false if any upload failed</returns>
public bool UploadFile(string uploadDir, string fileName)
{
    // Read the source file once; every iteration stores the same bytes.
    byte[] myData = File.ReadAllBytes(Path.Combine(uploadDir, fileName));
    bool allSucceeded = true;
    Mongo mongo = mongoDB;
    mongo.Connect();
    try
    {
        IMongoDatabase DB = mongo["dnt_mongodb"];
        GridFile fs = new GridFile(DB, "attach_gfstream");
        for (int i = 1; i < 10000; i++)
        {
            try
            {
                // Append the counter so each copy is stored under a distinct name.
                using (GridFileStream gfs = fs.Create(fileName + i))
                {
                    gfs.Write(myData, 0, myData.Length);
                }
            }
            catch (Exception)
            {
                // Record the failure instead of silently swallowing it.
                allSucceeded = false;
            }
        }
    }
    finally
    {
        mongo.Disconnect();
    }
    return allSucceeded;
}
After roughly 10,000 inserts (about 10,000 files), MongoDB starts to distribute the chunks produced by sharding from shard0000 to shard0001. The following command confirms this:
> db.printShardingStatus()
--- Sharding Status ---
sharding version: { "_id" : 1, "version" : 3 }
shards:
{
"_id" : "shard0000",
"host" : "10.0.4.85:27020",
"maxSize" : NumberLong( 2 )
}
{ "_id" : "shard0001", "host" : "10.0.4.85:27021" }
databases:
{ "_id" : "admin", "partitioned" : false, "primary" : "config" }
{ "_id" : "dnt_mongodb", "partitioned" : true, "primary" : "shard0000" }
dnt_mongodb.attach_gfstream.chunks chunks:
{ "files_id" : { $minKey : 1 } } -->> { "files_id" : ObjectId("4c85fd02145a9b1534010d89") } on : shard0001 { "t" : 2000, "i" : 0 }
{ "files_id" : ObjectId("4c85fd02145a9b1534010d89") } -->> { "files_id" : ObjectId("4c85fdec145a9b0b340005a7") } on : shard0000 { "t" :3000, "i" : 1 }
{ "files_id" : ObjectId("4c85fdec145a9b0b340005a7") } -->> { "files_id" : ObjectId("4c85fe08145a9b0b34000aaf") } on : shard0001 { "t" :3000, "i" : 4 }
{ "files_id" : ObjectId("4c85fe08145a9b0b34000aaf") } -->> { "files_id" : ObjectId("4c85fe27145a9b0b34000fb7") } on : shard0001 { "t" :4000, "i" : 1 }
{ "files_id" : ObjectId("4c85fe27145a9b0b34000fb7") } -->> { "files_id" : ObjectId("4c85fe43145a9b0b340014bf") } on : shard0000 { "t" :4000, "i" : 7 }
{ "files_id" : ObjectId("4c85fe43145a9b0b340014bf") } -->> { "files_id" : ObjectId("4c85fe61145a9b0b340019c7") } on : shard0000 { "t" :4000, "i" : 8 }
{ "files_id" : ObjectId("4c85fe61145a9b0b340019c7") } -->> { "files_id" : ObjectId("4c85fe7b145a9b0b34001ecf") } on : shard0000 { "t" :5000, "i" : 1 }
{ "files_id" : ObjectId("4c85fe7b145a9b0b34001ecf") } -->> { "files_id" : ObjectId("4c85fe9a145a9b0b340023d7") } on : shard0001 { "t" :5000, "i" : 4 }
{ "files_id" : ObjectId("4c85fe9a145a9b0b340023d7") } -->> { "files_id" : ObjectId("4c85feb7145a9b0b340028df") } on : shard0001 { "t" :6000, "i" : 1 }
{ "files_id" : ObjectId("4c85feb7145a9b0b340028df") } -->> { "files_id" : ObjectId("4c85feea145a9b0b340032ef") } on : shard0000 { "t" :6000, "i" : 4 }
{ "files_id" : ObjectId("4c85feea145a9b0b340032ef") } -->> { "files_id" : ObjectId("4c85ff25145a9b0b34003cff") } on : shard0000 { "t" :7000, "i" : 1 }
{ "files_id" : ObjectId("4c85ff25145a9b0b34003cff") } -->> { "files_id" : ObjectId("4c85ff57145a9b0b3400470f") } on : shard0001 { "t" :7000, "i" : 4 }
{ "files_id" : ObjectId("4c85ff57145a9b0b3400470f") } -->> { "files_id" : ObjectId("4c85ff87145a9b0b3400511f") } on : shard0001 { "t" :8000, "i" : 1 }
{ "files_id" : ObjectId("4c85ff87145a9b0b3400511f") } -->> { "files_id" : ObjectId("4c85ffcd145a9b0b34005b2f") } on : shard0000 { "t" :8000, "i" : 16 }
{ "files_id" : ObjectId("4c85ffcd145a9b0b34005b2f") } -->> { "files_id" : ObjectId("4c85fff7145a9b0b3400653f") } on : shard0000 { "t" :8000, "i" : 17 }
{ "files_id" : ObjectId("4c85fff7145a9b0b3400653f") } -->> { "files_id" : ObjectId("4c860021145a9b0b34006f4f") } on : shard0000 { "t" :8000, "i" : 18 }
{ "files_id" : ObjectId("4c860021145a9b0b34006f4f") } -->> { "files_id" : ObjectId("4c86004f145a9b0b3400795f") } on : shard0000 { "t" :8000, "i" : 19 }
{ "files_id" : ObjectId("4c86004f145a9b0b3400795f") } -->> { "files_id" : ObjectId("4c860080145a9b0b3400836f") } on : shard0000 { "t" :9000, "i" : 1 }
{ "files_id" : ObjectId("4c860080145a9b0b3400836f") } -->> { "files_id" : ObjectId("4c8600b5145a9b0b34008d7f") } on : shard0001 { "t" :9000, "i" : 7 }
{ "files_id" : ObjectId("4c8600b5145a9b0b34008d7f") } -->> { "files_id" : ObjectId("4c860115145a9b0b3400a183") } on : shard0001 { "t" :9000, "i" : 8 }
{ "files_id" : ObjectId("4c860115145a9b0b3400a183") } -->> { "files_id" : ObjectId("4c860198145a9b0b3400b587") } on : shard0001 { "t" :10000, "i" : 1 }
{ "files_id" : ObjectId("4c860198145a9b0b3400b587") } -->> { "files_id" : ObjectId("4c8601fc145a9b0b3400c98b") } on : shard0000 { "t" :10000, "i" : 11 }
{ "files_id" : ObjectId("4c8601fc145a9b0b3400c98b") } -->> { "files_id" : ObjectId("4c86025b145a9b0b3400dd8f") } on : shard0000 { "t" :10000, "i" : 12 }
{ "files_id" : ObjectId("4c86025b145a9b0b3400dd8f") } -->> { "files_id" : ObjectId("4c8602ca145a9b0b3400f193") } on : shard0000 { "t" :10000, "i" : 13 }
{ "files_id" : ObjectId("4c8602ca145a9b0b3400f193") } -->> { "files_id" : ObjectId("4c860330145a9b0b34010597") } on : shard0000 { "t" :10000, "i" : 14 }
{ "files_id" : ObjectId("4c860330145a9b0b34010597") } -->> { "files_id" : { $maxKey : 1 } } on : shard0000 { "t" : 10000, "i" : 15 }
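A listing this long is easier to read as a per-shard tally. The sketch below (plain JavaScript; `chunksPerShard` is a hypothetical helper, and the sample holds only the first three lines of the output above rather than the full list) counts how many chunk ranges each shard owns by reading the "on : shardNNNN" part of each line:

```javascript
// The first three chunk lines copied from the printShardingStatus() output above.
const sample = [
  '{ "files_id" : { $minKey : 1 } } -->> { "files_id" : ObjectId("4c85fd02145a9b1534010d89") } on : shard0001 { "t" : 2000, "i" : 0 }',
  '{ "files_id" : ObjectId("4c85fd02145a9b1534010d89") } -->> { "files_id" : ObjectId("4c85fdec145a9b0b340005a7") } on : shard0000 { "t" :3000, "i" : 1 }',
  '{ "files_id" : ObjectId("4c85fdec145a9b0b340005a7") } -->> { "files_id" : ObjectId("4c85fe08145a9b0b34000aaf") } on : shard0001 { "t" :3000, "i" : 4 }',
];

// Count chunk ranges per shard; lines without an "on : <shard>" part are ignored.
function chunksPerShard(lines) {
  const counts = {};
  for (const line of lines) {
    const match = / on : (\S+) \{/.exec(line);
    if (match) {
      counts[match[1]] = (counts[match[1]] || 0) + 1;
    }
  }
  return counts;
}

console.log(chunksPerShard(sample)); // { shard0001: 2, shard0000: 1 }
```

Pasting the whole listing into `sample` gives the overall balance between the two shards.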
Comparing the two collections, chunks takes up far more disk space than files: the former stores the binary content of each file, while the latter stores only structured metadata such as the file name and size.

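Below is a test that reads image data stored on shard0001 (note: not shard0000). Because mongos manages the chunks across all the shards, we only need to tell it the name of the file we want.
For example, take the file "2010/09/07/2/2856090617370.gif6243". The date-based path is just a naming format: our product stores uploaded attachments in matching disk directories, so a name that includes the path maps easily onto the disk path. The file currently lives on shard0001, and the following HTML is all that is needed to fetch the image: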
<img src="getfile.aspx?filename=2010/09/07/2/2856090617370.gif6243" width="30" />
The corresponding getfile.aspx.cs code is as follows:
public partial class getfile : System.Web.UI.Page
{
    public Mongo Mongo { get; set; }

    public IMongoDatabase DB
    {
        get
        {
            return this.Mongo["dnt_mongodb"];
        }
    }

    /// <summary>
    /// Opens the connection to mongos. Override this method to add custom initialization.
    /// </summary>
    public virtual void Init()
    {
        string ConnectionString = "Server=10.0.4.85:27017;ConnectTimeout=30000;ConnectionLifetime=300000;MinimumPoolSize=512;MaximumPoolSize=51200;Pooled=true";
        if (String.IsNullOrEmpty(ConnectionString))
            throw new ArgumentNullException("Connection string not found.");
        this.Mongo = new Mongo(ConnectionString);
        this.Mongo.Connect();
    }

    protected void Page_Load(object sender, EventArgs e)
    {
        if (!string.IsNullOrEmpty(Request.QueryString["filename"]))
        {
            string filename = Request.QueryString["filename"];
            Init();
            String filesystem = "attach_gfstream";
            GridFile fs = new GridFile(DB, filesystem);
            GridFileStream gfs = fs.OpenRead(filename);
            // Fixed-size read buffer, reused for every block. Sizing it from gfs.Length
            // would make Read fail for files smaller than the block size.
            Byte[] buffer = new Byte[10000];
            // The Expires and Cache-Control headers below are mainly for squid reverse-proxy acceleration; see http://www.cnblogs.com/daizhj/archive/2010/08/19/1803454.html for details
            // The "r" format does not convert to UTC, so start from UtcNow.
            HttpContext.Current.Response.AddHeader("Expires", DateTime.UtcNow.AddDays(20).ToString("r"));
            HttpContext.Current.Response.AddHeader("Cache-Control", "public");
            // Number of bytes still to be read
            long dataToRead = gfs.Length;
            int length;
            while (dataToRead > 0)
            {
                // Check whether the client is still connected
                if (HttpContext.Current.Response.IsClientConnected)
                {
                    length = gfs.Read(buffer, 0, buffer.Length);
                    if (length <= 0)
                        break; // stream ended early; avoid looping forever
                    HttpContext.Current.Response.OutputStream.Write(buffer, 0, length);
                    HttpContext.Current.Response.Flush();
                    dataToRead = dataToRead - length;
                }
                else
                {
                    // The client has disconnected, so leave the loop
                    dataToRead = -1;
                }
            }
            gfs.Dispose();
            this.Mongo.Disconnect();
            HttpContext.Current.Response.End();
        }
    }
}
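The read loop in the handler is a general pattern: read into one fixed-size buffer, write only the bytes actually read, and stop when a read returns nothing. Here is the same loop as a small self-contained JavaScript sketch. `makeReader` is a hypothetical in-memory stand-in for the GridFS stream and `copyInBlocks` is an illustrative name; neither belongs to any MongoDB driver.

```javascript
// Hypothetical in-memory reader standing in for a GridFS stream.
function makeReader(data) {
  let pos = 0;
  return {
    // Copies up to `count` bytes into `buffer` at `offset`; returns bytes copied (0 at end).
    read(buffer, offset, count) {
      const n = Math.min(count, data.length - pos);
      buffer.set(data.subarray(pos, pos + n), offset);
      pos += n;
      return n;
    },
  };
}

// Copy everything from `reader` to `write` using a single fixed-size buffer.
function copyInBlocks(reader, write, blockSize = 10000) {
  const buffer = new Uint8Array(blockSize);
  let total = 0;
  for (;;) {
    const n = reader.read(buffer, 0, blockSize);
    if (n <= 0) break;            // end of data: stop instead of looping forever
    write(buffer.subarray(0, n)); // pass on only the bytes actually read
    total += n;
  }
  return total;
}

const received = [];
const total = copyInBlocks(makeReader(new Uint8Array(25000)), (block) => received.push(block.length));
console.log(total, received); // 25000 [ 10000, 10000, 5000 ]
```

The last block is shorter than the buffer, which is why the write call must use the returned length rather than the buffer size.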
The steps above only shard the chunks collection. To shard the files collection as well, use the following command:
> db.runCommand( { shardcollection : "dnt_mongodb.attach_gfstream.files", key : { _id : 1 } } )
{ "collectionsharded" : "dnt_mongodb.attach_gfstream.files", "ok" : 1 }
After I had added nearly 500,000 records, mongos began saving new file information to shard0001.

The following command shows the information held on the shards:
> db.printShardingStatus()
... (the earlier files_id shard information is omitted)
{ "filename" : { $minKey : 1 } } -->> { "filename" : "2010//09//08//2//1393993713076.gif1" } on : shard0000 { "t" : 1000, "i" : 6 }
{ "filename" : "2010//09//08//2//1393993713076.gif1" } -->> { "filename" : "2010//09//08//2//2396571814760.gif9999" } on : shard0000 { "t" : 1000, "i" : 7 }
{ "filename" : "2010//09//08//2//2396571814760.gif9999"} -->> { "filename" : "2010//09//08//2//2819270318096.gif25366" } on : shard0000 { "t" : 2000, "i" : 2 }
{ "filename" : "2010//09//08//2//2819270318096.gif25366" } -->> { "filename" : "2010//09//08//2//3100748419355.gif999" } on : shard0000{ "t" : 2000, "i" : 3 }
{ "filename" : "2010//09//08//2//3100748419355.gif999" } -->> { "filename" : { $maxKey : 1 } } on : shard0001 { "t" : 2000, "i" : 0 }
Below is the mongos log output while the sharding was taking place:
Wed Sep 08 17:25:44 [conn5] ns: dnt_mongodb.attach_gfstream.files ClusteredCursor::query ShardConnection had to change attempt: 0
Wed Sep 08 17:32:34 [conn6] ns: dnt_mongodb.attach_gfstream.files ClusteredCursor::query ShardConnection had to change attempt: 0
Wed Sep 08 17:38:49 [conn55] autosplitting dnt_mongodb.attach_gfstream.chunks size: 188884488 shard: ns:dnt_mongodb.attach_gfstream.chunks at: shard0001:10.0.4.85:27021 lastmod: 11|3 min: { files_id: ObjectId('4c8755b3145a9b16d41d5dc9') } max: { files_id: MaxKey } on: { files_id: ObjectId('4c8759a5145a9b16d42300d7') }(splitThreshold 188743680)
Wed Sep 08 17:38:49 [conn55] config change: { _id: "4_85-2010-09-08T09:38:49-10", server: "4_85", time: new Date(1283938729648), what: "split", ns: "dnt_mongodb.attach_gfstream.chunks", details: { before: { min: { files_id: ObjectId('4c8755b3145a9b16d41d5dc9') }, max: { files_id: MaxKey } }, left: { min: { files_id: ObjectId('4c8755b3145a9b16d41d5dc9') }, max: { files_id: ObjectId('4c8759a5145a9b16d42300d7') } }, right: { min: { files_id: ObjectId('4c8759a5145a9b16d42300d7') }, max: { files_id: MaxKey } } } }
Wed Sep 08 17:38:49 [conn98] ns: dnt_mongodb.attach_gfstream.chunks ClusteredCursor::query ShardConnection had to change attempt: 0
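The autosplitting log entry reports both the chunk size and the splitThreshold in bytes. A quick arithmetic check (plain JavaScript; the variable names are my own) shows that the threshold is exactly 180 MB and that the chunk was split just after growing past it:

```javascript
// Values copied from the "autosplitting" log entry above.
const chunkSizeBytes = 188884488;
const splitThresholdBytes = 188743680;
const MB = 1024 * 1024;

console.log(splitThresholdBytes / MB);             // 180
console.log(chunkSizeBytes > splitThresholdBytes); // true
console.log(chunkSizeBytes - splitThresholdBytes); // 140808 bytes over the threshold
```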
When the requested images sit on shard0000 and shard0001 respectively, mongos routes each request to the right shard on its own. For example, the two links below point to files on shard0000 and shard0001:
<img src="getfile.aspx?filename=2010/09/08/2/1393993713076.gif5" width="30" /> located on shard0000
<img src="getfile.aspx?filename=2010/09/08/2/3197962515515.gif9" width="30" /> located on shard0001
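The routing decision can be sketched as a range lookup over the filename chunk boundaries listed earlier. This is only an illustration in plain JavaScript, not mongos code. It assumes that the doubled slashes in the printShardingStatus output are a display artifact (they are collapsed to single slashes here), that keys compare as plain binary strings, and that each range includes its lower bound and excludes its upper bound. `shardFor` is a hypothetical helper.

```javascript
// Upper bounds of the filename chunks from the shard listing earlier in the article,
// with "//" collapsed to "/" (assumption). null stands for $maxKey.
const chunks = [
  { max: "2010/09/08/2/1393993713076.gif1", shard: "shard0000" },
  { max: "2010/09/08/2/2396571814760.gif9999", shard: "shard0000" },
  { max: "2010/09/08/2/2819270318096.gif25366", shard: "shard0000" },
  { max: "2010/09/08/2/3100748419355.gif999", shard: "shard0000" },
  { max: null, shard: "shard0001" },
];

// Return the shard of the first chunk whose (exclusive) upper bound lies above the key.
function shardFor(filename) {
  for (const chunk of chunks) {
    if (chunk.max === null || filename < chunk.max) {
      return chunk.shard;
    }
  }
}

console.log(shardFor("2010/09/08/2/1393993713076.gif5")); // shard0000
console.log(shardFor("2010/09/08/2/3197962515515.gif9")); // shard0001
```

Under those assumptions the lookup agrees with the two links above: the first name falls in the second range on shard0000, and the second sorts after the last boundary and so lands in the final range on shard0001.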
That's all for today's article.