Modern Information Retrieval (English, 2nd Edition)

Modern Information Retrieval (English, 2nd Edition) is a book published in 2011 by China Machine Press, written by Ricardo Baeza-Yates (Spain) and Berthier Ribeiro-Neto (Brazil).

Chinese title: 现代信息检索（英文第2版）
On-sale date: March 7, 2011
Publisher: China Machine Press
ISBN: 9787111331742
Publication date: March 2011

Synopsis

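　　Modern Information Retrieval (English Edition, 2nd Edition) covers in detail all the major concepts and techniques of information retrieval, along with the recent developments in the field, giving readers both a comprehensive overview of modern information retrieval and detailed knowledge of all its key topics. The main content is written by Baeza-Yates and Ribeiro-Neto, leading figures in the information retrieval field; for readers who want to study key areas in depth, the book also includes state-of-the-art surveys of specialized topics written by other leading researchers.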

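　　Compared with the previous edition, the content and structure of this edition have been extensively revised, updated, and expanded, with roughly 60% to 70% new material. The specific updates are as follows:

　　·New chapters on text classification, web crawling, structured text retrieval, and enterprise search, plus an appendix on open source search.

　　·Completely rewritten material on user interfaces, multimedia retrieval, and digital libraries.

　　·Expanded chapters covering important new developments in information retrieval, such as language models, new evaluation methods, query properties, and cluster-based and distributed information retrieval.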

Contents

　　preface to the second edition v

　　preface to the first edition vii

　　authors' acknowledgements to the second edition viii

　　authors' acknowledgements to the first edition x

　　publishers' acknowledgements xii

　　contents xvii

　　1 introduction 1

　　1.1 information retrieval 1

　　1.1.1 early developments 1

　　1.1.2 information retrieval in libraries and digital libraries 3

　　1.1.3 ir at the center of the stage 3

　　1.2 the ir problem 3

　　1.2.1 the user's task 4

　　1.2.2 information versus data retrieval 5

　　1.3 the ir system 5

　　1.3.1 software architecture of the ir system 5

　　1.3.2 the retrieval and ranking processes 7

　　1.4 the web 8

　　1.4.1 a brief history 8

　　1.4.2 the e-publishing era 9

　　1.4.3 how the web changed search 10

　　1.4.4 practical issues on the web 12

　　1.5 organization of the book 12

　　1.5.1 focus of the book 12

　　1.5.2 book contents 13

　　1.6 the book web site: a teaching resource 16

　　1.7 bibliographic discussion 17

　　2 user interfaces for search 21

　　by marti hearst

　　2.1 introduction 21

　　2.2 how people search 21

　　2.2.1 information lookup versus exploratory search 22

　　2.2.2 classic versus dynamic model of information seeking 23

　　2.2.3 navigation versus search 24

　　2.2.4 observations of the search process 24

　　2.3 search interfaces today 25

　　2.3.1 getting started 25

　　2.3.2 query specification 26

　　2.3.3 query specification interfaces 27

　　2.3.4 retrieval results display 29

　　2.3.5 query reformulation 32

　　2.3.6 organizing search results 35

　　2.4 visualization in search interfaces 40

　　2.4.1 visualizing boolean syntax 42

　　2.4.2 visualizing query terms within retrieval results 43

　　2.4.3 visualizing relationships among words and documents 47

　　2.4.4 visualization for text mining 49

　　2.5 design and evaluation of search interfaces 50

　　2.6 trends and research issues 54

　　2.7 bibliographic discussion 54

　　3 modeling 57

　　3.1 ir models 57

　　3.1.1 modeling and ranking 57

　　3.1.2 characterization of an ir model 58

　　3.1.3 a taxonomy of ir models 59

　　3.2 classic information retrieval 61

　　3.2.1 basic concepts 61

　　3.2.2 the boolean model 64

　　3.2.3 term weighting 66

　　3.2.4 tf-idf weights 68

　　3.2.5 document length normalization 75

　　3.2.6 the vector model 77

　　3.2.7 the probabilistic model 79

　　3.2.8 brief comparison of classic models 86

　　3.3 alternative set theoretic models 87

　　3.3.1 set-based model 87

　　3.3.2 extended boolean model 92

　　3.3.3 fuzzy set model 95

　　3.4 alternative algebraic models 98

　　3.4.1 generalized vector space model 98

　　3.4.2 latent semantic indexing model 101

　　3.4.3 neural network model 102

　　3.5 alternative probabilistic models 104

　　3.5.1 bm25 104

　　3.5.2 language models 107

　　3.5.3 divergence from randomness 113

　　3.5.4 bayesian network models 116

　　3.6 other models 124

　　3.6.1 the hypertext model 124

　　3.6.2 web based models 125

　　3.6.3 structured text retrieval 126

　　3.6.4 multimedia retrieval 126

　　3.6.5 enterprise and vertical search 126

　　3.7 trends and research issues 127

　　3.8 bibliographic discussion 128

　　4 retrieval evaluation 131

　　4.1 introduction 131

　　4.2 the cranfield paradigm 132

　　4.2.1 a brief history 132

　　4.2.2 reference collections 134

　　4.3 retrieval metrics 134

　　4.3.1 precision and recall 135

　　4.3.2 single value summaries: p@n, map, mrr, f 139

　　4.3.3 user-oriented measures 144

　　4.3.4 dcg: discounted cumulated gain 145

　　4.3.5 bpref: binary preferences 150

　　4.3.6 rank correlation metrics 153

　　4.4 reference collections 158

　　4.4.1 the trec collections 159

　　4.4.2 other reference collections 166

　　4.4.3 other small test collections 167

　　4.5 user-based evaluation 168

　　4.5.1 human experimentation in the lab 168

　　4.5.2 side-by-side panels 168

　　4.5.3 a/b testing 169

　　4.5.4 crowdsourcing 170

　　4.5.5 evaluation using clickthrough data 171

　　4.6 practical caveats 173

　　4.7 trends and research issues 174

　　4.8 bibliographic discussion 174

　　5 relevance feedback and query expansion 177

　　5.1 introduction 177

　　5.2 a framework for feedback methods 178

　　5.3 explicit relevance feedback 180

　　5.3.1 relevance feedback for the vector model: rocchio method 181

　　5.3.2 relevance feedback for the probabilistic model 183

　　5.3.3 evaluation of relevance feedback 184

　　5.4 explicit feedback through clicks 185

　　5.4.1 eye tracking and relevance judgements 185

　　5.4.2 user behavior 186

　　5.4.3 clicks as a metric of user preferences 187

　　5.5 implicit feedback through local analysis 190

　　5.5.1 implicit feedback through local clustering 190

　　5.5.2 implicit feedback through local context analysis 193

　　5.6 implicit feedback through global analysis 195

　　5.6.1 query expansion based on a similarity thesaurus 195

　　5.6.2 query expansion based on a statistical thesaurus 198

　　5.7 trends and research issues 200

　　5.8 bibliographic discussion 200

　　6 documents: languages & properties 203

　　with gonzalo navarro and nivio ziviani

　　6.1 introduction 203

　　6.2 metadata 205

　　6.3 document formats 206

　　6.3.1 text 206

　　6.3.2 multimedia 207

　　6.3.3 graphics and virtual reality 208

　　6.4 markup languages 208

　　6.4.1 sgml 209

　　6.4.2 html 211

　　6.4.3 xml 214

　　6.4.4 rdf: resource description framework 216

　　6.4.5 hytime 217

　　6.5 text properties 218

　　6.5.1 information theory 218

　　6.5.2 modeling natural language 219

　　6.5.3 text similarity 222

　　6.6 document preprocessing 223

　　6.6.1 lexical analysis of the text 224

　　6.6.2 elimination of stopwords 226

　　6.6.3 stemming 226

　　6.6.4 keyword selection 227

　　6.6.5 thesauri 228

　　6.7 organizing documents 231

　　6.7.1 taxonomies 231

　　6.7.2 folksonomies 232

　　6.8 text compression 233

　　6.8.1 basic concepts 234

　　6.8.2 statistical methods 234

　　6.8.3 statistical methods: modeling 235

　　6.8.4 statistical methods: coding 238

　　6.8.5 dictionary methods 245

　　6.8.6 preprocessing for compression 246

　　6.8.7 comparing text compression techniques 248

　　6.8.8 structured text compression 249

　　6.9 trends and research issues 250

　　6.10 bibliographical discussion 253

　　7 queries: languages & properties 255

　　with gonzalo navarro

　　7.1 query languages 255

　　7.1.1 keyword-based querying 256

　　7.1.2 beyond keywords 259

　　7.1.3 structural queries 262

　　7.1.4 query protocols 265

　　7.2 query properties 267

　　7.2.1 characterizing web queries 267

　　7.2.2 user search behavior 269

　　7.2.3 query intent 270

　　7.2.4 query topic 272

　　7.2.5 query sessions and missions 273

　　7.2.6 query difficulty 274

　　7.3 trends and research issues 278

　　7.4 bibliographical discussion 279

　　8 text classification 281

　　with marcos gonçalves

　　8.1 introduction 281

　　8.2 a characterization of text classification 282

　　8.2.1 machine learning 282

　　8.2.2 the text classification problem 283

　　8.2.3 text classification algorithms 284

　　8.3 unsupervised algorithms 286

　　8.3.1 clustering 286

　　8.3.2 naive text classification 290

　　8.4 supervised algorithms 291

　　8.4.1 decision trees 294

　　8.4.2 the k-nn classifier 299

　　8.4.3 the rocchio classifier 300

　　8.4.4 probabilistic naive bayes document classification 303

　　8.4.5 the svm classifier 306

　　8.4.6 ensemble classifiers 316

　　8.4.7 final remarks on supervised algorithms 319

　　8.5 feature selection or dimensionality reduction 320

　　8.5.1 term–class incidence table 321

　　8.5.2 term document frequency 322

　　8.5.3 tf-idf weights 322

　　8.5.4 mutual information 323

　　8.5.5 information gain 323

　　8.5.6 chi square 324

　　8.5.7 impact of feature selection 325

　　8.6 evaluation metrics 325

　　8.6.1 contingency table 325

　　8.6.2 accuracy and error 326

　　8.6.3 precision and recall 327

　　8.6.4 f-measure and f1 327

　　8.6.5 cross-validation 329

　　8.6.6 standard collections 329

　　8.7 organizing the classes – building taxonomies 330

　　8.8 trends and research issues 333

　　8.9 bibliographic discussion 334

　　9 indexing and searching 337

　　with gonzalo navarro

　　9.1 introduction 337

　　9.2 inverted indexes 340

　　9.2.1 basic concepts 340

　　9.2.2 full inverted indexes 341

　　9.2.3 searching 345

　　9.2.4 ranking 348

　　9.2.5 construction 351

　　9.2.6 compressed inverted indexes 354

　　9.2.7 structural queries 357

　　9.3 signature files 357

　　9.4 suffix trees and suffix arrays 360

　　9.4.1 structure: tries and suffix trees 361

　　9.4.2 searching for simple strings 362

　　9.4.3 searching for complex patterns 363

　　9.4.4 construction 365

　　9.4.5 compressed suffix arrays 367

　　9.5 sequential searching 372

　　9.5.1 simple strings: horspool 373

　　9.5.2 complex patterns: automata and bit-parallelism 375

　　9.5.3 faster bit-parallel algorithms 379

　　9.5.4 regular expressions 382

　　9.5.5 multiple patterns 384

　　9.5.6 approximate searching 385

　　9.5.7 searching compressed text 389

　　9.6 multi-dimensional indexing 391

　　9.7 trends and research issues 393

　　9.8 bibliographic discussion 394

　　10 parallel and distributed ir 399

　　with eric brown

　　10.1 introduction 399

　　10.2 a taxonomy of distributed ir systems 402

　　10.3 data partitioning 404

　　10.3.1 collection partitioning 405

　　10.3.2 collection selection 407

　　10.3.3 inverted index partitioning 409

　　10.3.4 partitioning other indexes 413

　　10.4 parallel ir 414

　　10.4.1 introduction 414

　　10.4.2 parallel ir on mimd architectures 416

　　10.4.3 parallel ir on simd architectures 418

　　10.5 cluster-based ir 423

　　10.6 distributed ir 424

　　10.6.1 introduction 424

　　10.6.2 indexing 428

　　10.6.3 query processing 431

　　10.6.4 web issues 437

　　10.7 federated search 438

　　10.8 retrieval in peer-to-peer networks 440

　　10.9 trends and research issues 444

　　10.10 bibliographic discussion 445

　　11 web retrieval 447

　　with yoelle maarek

　　11.1 introduction 447

　　11.2 a challenging problem 449

　　11.3 the web 451

　　11.3.1 characteristics 451

　　11.3.2 structure of the web graph 452

　　11.3.3 modeling the web 454

　　11.3.4 link analysis 456

　　11.4 search engine architectures 458

　　11.4.1 basic architecture 458

　　11.4.2 cluster-based architecture 459

　　11.4.3 caching 462

　　11.4.4 multiple indexes 464

　　11.4.5 distributed architectures 466

　　11.5 search engine ranking 468

　　11.5.1 ranking signals 469

　　11.5.2 link-based ranking 470

　　11.5.3 simple ranking functions 473

　　11.5.4 learning to rank 473

　　11.5.5 learning the ranking function 474

　　11.5.6 quality evaluation 475

　　11.5.7 web spam 476

　　11.6 managing web data 477

　　11.6.1 assigning identifiers to documents 477

　　11.6.2 metadata 478

　　11.6.3 compressing the web graph 478

　　11.6.4 handling duplicated data 479

　　11.7 search engine user interaction 480

　　11.7.1 the search rectangle paradigm 481

　　11.7.2 the search engine result page 488

　　11.7.3 educating the user 497

　　11.8 browsing 498

　　11.8.1 flat browsing 499

　　11.8.2 structure guided browsing and web directories 499

　　11.9 beyond browsing 501

　　11.9.1 hypertext and the web 501

　　11.9.2 combining searching with browsing 501

　　11.9.3 web query languages 503

　　11.9.4 dynamic search 503

　　11.10 related problems 504

　　11.10.1 computational advertising 504

　　11.10.2 web mining 506

　　11.10.3 metasearch 508

　　11.11 trends and research issues 509

　　11.11.1 beyond static text data 509

　　11.11.2 current challenges 511

　　11.12 bibliographical discussion 513

　　12 web crawling 515

　　with carlos castillo

　　12.1 introduction 515

　　12.2 applications of a web crawler 517

　　12.2.1 general web search 517

　　12.2.2 topical crawling 518

　　12.2.3 web characterization 518

　　12.2.4 mirroring 518

　　12.2.5 web site analysis 519

　　12.3 a taxonomy of crawlers 519

　　12.3.1 types of web pages 520

　　12.4 architecture and implementation 521

　　12.4.1 crawler architecture 521

　　12.4.2 practical issues 523

　　12.4.3 parallel crawling 526

　　12.5 scheduling algorithms 527

　　12.5.1 selection policy 528

　　12.5.2 revisit policy 530

　　12.5.3 politeness policy 535

　　12.5.4 combining policies 538

　　12.6 evaluation 539

　　12.6.1 evaluating network usage 539

　　12.6.2 evaluating long-term scheduling 540

　　12.7 trends and research issues 541

　　12.7.1 crawling the "hidden" web 541

　　12.7.2 crawling with the help of web sites 542

　　12.7.3 distributed crawling 543

　　12.8 bibliographic discussion 543

　　13 structured text retrieval 545

　　with mounia lalmas

　　13.1 introduction 545

　　13.2 structuring power 546

　　13.2.1 explicit vs. implicit structure 546

　　13.2.2 static vs. dynamic structure 547

　　13.2.3 single hierarchy vs. multiple hierarchies 548

　　13.3 early text retrieval models 549

　　13.3.1 model based on non-overlapping lists 549

　　13.3.2 model based on proximal nodes 550

　　13.3.3 ranking structured text results 551

　　13.4 xml retrieval 551

　　13.4.1 challenges in xml retrieval 551

　　13.4.2 indexing strategies 553

　　13.4.3 ranking strategies 554

　　13.4.4 removing overlaps 565

　　13.5 xml retrieval evaluation 566

　　13.5.1 document collections 566

　　13.5.2 topics 567

　　13.5.3 retrieval tasks 568

　　13.5.4 relevance 569

　　13.5.5 measures 571

　　13.6 query languages 573

　　13.6.1 characteristics 574

　　13.6.2 classification of xml query languages 575

　　13.6.3 examples of xml query languages 577

　　13.7 trends and research issues 582

　　13.8 bibliographic discussion 585

　　14 multimedia information retrieval 587

　　by dulce ponceleón and malcolm slaney

　　14.1 introduction 587

　　14.1.1 what is multimedia? 587

　　14.1.2 multimedia ir 588

　　14.1.3 text ir versus multimedia ir 589

　　14.2 the challenges 589

　　14.2.1 the semantic gap 589

　　14.2.2 feature ambiguity 591

　　14.2.3 machine-generated data 591

　　14.3 content-based image retrieval 592

　　14.3.1 color-based retrieval 593

　　14.3.2 texture 593

　　14.3.3 salient points 596

　　14.4 audio and music retrieval 597

　　14.4.1 fingerprinting 598

　　14.4.2 speech recognition 599

　　14.4.3 speaker identification 601

　　14.4.4 spoken document retrieval 602

　　14.4.5 audio basics 602

　　14.5 retrieving and browsing video 606

　　14.5.1 video abstracts 606

　　14.5.2 static summaries 607

　　14.5.3 mosaics and salient stills 608

　　14.5.4 dynamic summaries 609

　　14.5.5 interactive summaries 611

　　14.5.6 visual vs. audio browsing 612

　　14.5.7 evaluating summaries 613

　　14.6 fusion models: combining it all 614

　　14.6.1 naming faces 614

　　14.6.2 naming images 615

　　14.6.3 naming audio 616

　　14.6.4 combining audio and video for avsr 617

　　14.6.5 combining audio and video for multimedia 620

　　14.7 segmentation 620

　　14.7.1 a video segmentation example 620

　　14.7.2 segmentation schemes for video 622

　　14.7.3 video segmentation with edges 623

　　14.7.4 speech segmentation 624

　　14.7.5 segmentation evaluation 625

　　14.8 compression and mpeg standards 625

　　14.8.1 intensity and sampling 626

　　14.8.2 color 626

　　14.8.3 lossy compression 628

　　14.8.4 lossless compression 628

　　14.8.5 temporal redundancy 630

　　14.8.6 motion prediction 631

　　14.8.7 mpeg standards 633

　　14.9 trends and research issues 636

　　14.10 bibliographic discussion 637

　　15 enterprise search 641

　　by david hawking

　　15.1 introduction 641

　　15.1.1 characteristics and applications of enterprise search 642

　　15.1.2 enterprise search software 643

　　15.1.3 workplace search 644

　　15.2 enterprise search tasks 644

　　15.2.1 examples of search-supported tasks 644

　　15.2.2 search types 647

　　15.2.3 studying enterprise search 647

　　15.3 architecture of enterprise search systems 648

　　15.3.1 gathering 648

　　15.3.2 extracting 651

　　15.3.3 indexing 652

　　15.3.4 indexing textual annotations 653

　　15.3.5 query processing 654

　　15.3.6 presentation of search results 655

　　15.3.7 security models 657

　　15.3.8 federation/metasearch 659

　　15.4 enterprise search evaluation 662

　　15.4.1 published test collections for enterprise search 662

　　15.4.2 internal enterprise search evaluations 663

　　15.4.3 enterprise search tuning 665

　　15.4.4 what is it reasonable to expect? 666

　　15.5 potential reasons for dissatisfaction 667

　　15.6 context and personalization 668

　　15.6.1 controls and levers for contextualization 671

　　15.6.2 contextualization: local, enterprise or global? 675

　　15.6.3 privacy of profiles 676

　　15.6.4 defining, creating and maintaining a profile 677

　　15.6.5 user modeling 677

　　15.6.6 implicit measures 679

　　15.6.7 information filtering 679

　　15.6.8 social recommender systems 680

　　15.7 trends and research issues 681

　　15.8 bibliographic discussion 681

　　16 library systems 685

　　by edie rasmussen

　　16.1 the information environment in the library 685

　　16.2 online public access catalogues 687

　　16.2.1 opacs and bibliographic records 689

　　16.2.2 information retrieval from the ils 691

　　16.2.3 integrating the hybrid library 693

　　16.2.4 opacs and end users 694

　　16.2.5 ils: vendors and products 695

　　16.3 ir systems and document databases 697

　　16.3.1 bibliographic and full-text databases 698

　　16.3.2 content of database records 698

　　16.3.3 the online industry: database vendors 701

　　16.3.4 information retrieval from document databases 702

　　16.4 information retrieval in organizations 706

　　16.5 trends and research issues 708

　　16.6 bibliographic discussion 709

　　17 digital libraries 711

　　by marcos gonçalves

　　17.1 introduction 711

　　17.2 defining digital libraries 712

　　17.3 a general architecture 713

　　17.4 fundamentals 714

　　17.4.1 digital objects and collections 714

　　17.4.2 metadata and catalogs 716

　　17.4.3 repositories/archives 719

　　17.4.4 services 723

　　17.5 social-economical issues 725

　　17.5.1 social issues 725

　　17.5.2 economical issues 726

　　17.6 software systems 727

　　17.6.1 greenstone 728

　　17.6.2 eprints 728

　　17.6.3 dspace 728

　　17.6.4 fedora 729

　　17.6.5 open digital libraries 729

　　17.6.6 the 5s suite 730

　　17.7 dl case studies 731

　　17.7.1 the networked dl of theses and dissertations 731

　　17.7.2 the national science digital library 732

　　17.7.3 the etana-dl archaeological digital library 732

　　17.8 trends and research issues 733

　　17.8.1 evaluation 733

　　17.8.2 integration 733

　　17.8.3 other research challenges 734

　　17.9 bibliographic discussion 735

　　a open source search engines 737

　　with christian middleton

　　a.1 introduction 737

　　a.2 search engines 738

　　a.2.1 preliminary selection of search engines 738

　　a.2.2 features 741

　　a.2.3 evaluation 742

　　a.3 methodology 743

　　a.3.1 document collections 743

　　a.3.2 evaluation tests 744

　　a.3.3 experimental setup 744

　　a.4 experimental results 745

　　a.4.1 test a – indexing 745

　　a.4.2 test b – incremental indexing 749

　　a.4.3 test c – search performance 749

　　a.4.4 global evaluation 752

　　a.5 conclusions 753

　　b biographies 755

　　references 761

　　index 893

2023-02-19 18:40:21 Encyclopedia
